Multiprocessing and Indigo

Posted on
Sun Nov 15, 2020 1:46 pm
DaveL17 offline
Posts: 6753
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Multiprocessing and Indigo

I wanted to use threads, but in this instance I'm trying to code around a memory leak in Matplotlib and I'm doing that by positively killing the process when it's finished. My understanding is that affirmatively killing threads is fraught with peril (if not impossible).

I came here to drink milk and kick ass....and I've just finished my milk.

[My Plugins] - [My Forums]

Posted on
Sun Nov 15, 2020 6:03 pm
matt (support) offline
Site Admin
Posts: 21417
Joined: Jan 27, 2003
Location: Texas

Re: Multiprocessing and Indigo

I think I narrowed down the problem a bit more. The multiprocessing module on Python 2.7 always forks the main process (which is the Indigo Plugin Host process) for the children. Those children get copies of all the memory/objects that the parent had. So in this case pickling/unpickling doesn't actually occur – everything just gets cloned via the fork(). A downside of using fork() is that it is notorious for causing deadlocks especially if the process has multiple threads (which Indigo Plugin Host does internally), because the thread lock states will also get cloned in the forked child processes.
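To make the "lock states also get cloned" point concrete, here is a minimal sketch, assuming a Unix-like system where os.fork() is available. The function name is made up for illustration, and the lock is an ordinary threading.Lock, not anything from the Indigo Plugin Host:

```python
import os
import threading


def lock_state_seen_by_forked_child():
    """Fork while a lock is held and report how the child sees that lock."""
    lock = threading.Lock()
    lock.acquire()  # the lock is held at the moment of the fork

    pid = os.fork()
    if pid == 0:
        # Child process: it received a copy of the lock, including its state.
        # A non-blocking acquire fails if the "held" state was cloned.
        # In the real failure mode the holder is another thread, which does
        # not exist in the child, so a blocking acquire would wait forever.
        got_it = lock.acquire(False)
        os._exit(0 if got_it else 1)  # leave at once, skipping cleanup handlers

    # Parent process: wait for the child and translate its exit code.
    _, status = os.waitpid(pid, 0)
    lock.release()
    return "held" if os.WEXITSTATUS(status) == 1 else "free"


print(lock_state_seen_by_forked_child())
```

The child only probes the lock without blocking, so the sketch itself cannot hang.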

One way around this problem is to have multiprocessing use a different start method by calling multiprocessing.set_start_method("spawn"). This forces a clean execution environment on the child process, and the arguments are pickled/unpickled instead, which for any Indigo-based objects (devices, schedules, indigo.Dict, indigo.List, or even the indigo plugin instance itself) will throw an exception. That would be advantageous since the developer would get an exception immediately when trying to start the child process instead of getting a random deadlock some of the time. However, multiprocessing.set_start_method doesn't exist in Python 2.7 (it was added in Python 3.4), so we are stuck with it always using the fork() method, which is at risk of deadlocks.
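The "fail immediately instead of deadlocking sometimes" behavior can be approximated by hand: try to pickle each argument before handing it to a child. This is only a sketch. FakeHostObject is a hypothetical stand-in (it holds a thread lock, which pickle refuses), not a real Indigo class, and unpicklable_args is a made-up helper name:

```python
import pickle
import threading


class FakeHostObject(object):
    """Hypothetical stand-in for an object that cannot be pickled."""

    def __init__(self):
        self._lock = threading.Lock()  # thread locks are not picklable


def unpicklable_args(args):
    """Return the positions of the arguments that pickle refuses to serialize."""
    bad = []
    for position, arg in enumerate(args):
        try:
            pickle.dumps(arg)
        except (pickle.PicklingError, TypeError, AttributeError):
            bad.append(position)
    return bad


# Only the stand-in object at position 1 is rejected.
print(unpicklable_args([{"fontSize": 12}, FakeHostObject(), "chart.png"]))  # prints [1]
```

Running a check like this before starting a child gives the early, repeatable error that the "spawn" method would otherwise provide.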

I believe that not using any Indigo APIs or objects in the multiprocess instance methods will avoid the deadlocks though. I'm not 100% sure as there might still be some risk of deadlocking internally in the Indigo Plugin Host plumbing but if no APIs are calling into it I hope the risk is low.


Posted on
Mon Nov 16, 2020 9:18 pm
DaveL17 offline
Posts: 6753
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Multiprocessing and Indigo

I'm testing out subprocess to overcome the limitations of multiprocessing under Python 2.7 (and probably 3.x) and will post things that will hopefully be helpful in using this approach. I plan to add to this as I go. Please chime in if anything is mischaracterized or inaccurate, or with other suggestions.

SUBPROCESS ARGUMENTS MUST BE STRINGS
My intention is to use subprocess.Popen() and do the work I need to do in the child processes. subprocess.Popen() requires that all passed arguments be strings, so it's not possible to pass objects like a function call or indigo object (like a dev instance or an indigo.Dict() object). Instead, it's necessary to serialize these data. As Matt noted above, Indigo objects can't be serialized so they need to be converted first. For example, convert an indigo.Dict() to a Python dict and then serialize it:

Code: Select all
#! /usr/bin/env python
# -*- coding: utf-8 -*-

import pickle

dev = indigo.devices[66959601]

dev_props = dev.ownerProps

dev_dict = dict(dev_props)  # You can't pickle an indigo.Dict() object, so convert it first.

dev_pickle = pickle.dumps(dev_dict)

indigo.server.log(repr(dev_pickle))  # A

dev_unpickle = pickle.loads(dev_pickle)

indigo.server.log(repr(dev_unpickle))  # B
Yields:

Code: Select all
   Script                          '(dp0\nVdayOfWeekFormat\np1\nVmid\np2\nsVfirstDayOfWeek\np3\nV6\np4\nsVdayOfWeekAlignment\np5\nVright\np6\nsVfileName\np7\nVchart_calendar.png\np8\nsVcustomSizeHeight\np9\nVNone\np10\nsVfontSize\np11\nI12\nsVcalendarGrid\np12\nI01\nsVisChart\np13\nI01\nsVrefreshInterval\np14\nV60\np15\nsVcustomSizeWidth\np16\nVNone\np17\ns.'  # A
   Script                          {u'customSizeHeight': u'None', u'firstDayOfWeek': u'6', u'dayOfWeekAlignment': u'right', u'fileName': u'chart_calendar.png', u'dayOfWeekFormat': u'mid', u'fontSize': 12, u'calendarGrid': True, u'isChart': True, u'refreshInterval': u'60', u'customSizeWidth': u'None'}  # B
Note that pickle handles strings, ints, and bools without trouble (the 'None' in the example output is the string u'None', not the literal None type).

Because the child runs as a separate Python process, whatever we want to do in the subprocess requires that script to "stand on its own". In other words, even if you've imported something into plugin.py, you'll also need to import it in the subprocess script.
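Since every Popen() argument has to be a string, one option is to wrap the pickle in base64 so the payload is plain ASCII no matter which pickle protocol is used. This is a sketch of that idea, not necessarily what the plugin does. The helper names encode_payload and decode_payload are made up, and props is an ordinary dict standing in for the converted device props:

```python
import base64
import pickle


def encode_payload(obj):
    """Pickle obj and wrap it in base64 so it is safe to pass as one argument."""
    raw = pickle.dumps(obj, protocol=2)
    return base64.b64encode(raw).decode("ascii")


def decode_payload(text):
    """Reverse encode_payload(); the child would call this on sys.argv[1]."""
    return pickle.loads(base64.b64decode(text.encode("ascii")))


# Plain dict standing in for dict(dev.ownerProps).
props = {"fileName": "chart_calendar.png", "fontSize": 12, "isChart": True}

payload = encode_payload(props)            # parent side: build the argument
print(decode_payload(payload) == props)    # child side: recover the dict
```

The parent would pass payload as one of the items in the Popen() argument list, and the child would decode it from sys.argv.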

LIMITED I/O IS POSSIBLE
Or more precisely, it's possible to send data back to the plugin from the spawned process. This simple example sends data to the child process and the child sends data back to the parent.

test1.py (the child)
Code: Select all
#! /usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import pickle

pickle.dump(sys.argv, sys.stdout)  # This is the data returned to the parent if everything goes well

raise ValueError(u"This is the error text to display.")
test2.py (the parent)
Code: Select all
#! /usr/bin/env python
# -*- coding: utf-8 -*-

import subprocess
import pickle

path_to_file = '/Users/Dave/PycharmProjects/development/test1.py'
proc = subprocess.Popen(['python2.7', path_to_file, 'foo', 'bar',],
                  stdout=subprocess.PIPE,
                  stderr=subprocess.PIPE,
                  )

# Note reply is a pickle, err is a string
reply, err = proc.communicate()

reply = pickle.loads(reply)

print(repr(reply))  # A
print(type(reply))  # B

print(err)  # C

print(proc.returncode)  # D
Output from running test2.py:
Code: Select all
['/Users/Dave/PycharmProjects/development/test1.py', 'foo', 'bar']  # A
<type 'list'>  # B
Traceback (most recent call last):
  File "/Users/Dave/PycharmProjects/development/test1.py", line 9, in <module>
    raise ValueError(u"This is the error text to display.")
ValueError: This is the error text to display.  # C

1  # D
Note that proc.returncode is a 1 in this example because test1.py raises a ValueError; otherwise, it would be 0 (zero).
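One thing to watch in the parent: if the child dies before it writes anything, stdout is empty and pickle.loads() raises its own error. Below is a sketch of a more defensive parent. The run_child helper is a made-up name, and the failing child is a hypothetical one-liner started with sys.executable rather than a script on disk:

```python
import pickle
import subprocess
import sys


def run_child(command):
    """Run a child process and return (reply, error_text, returncode)."""
    proc = subprocess.Popen(command,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = proc.communicate()

    reply = None
    if out:  # only unpickle when the child actually wrote something
        try:
            reply = pickle.loads(out)
        except Exception:  # truncated or non-pickle output
            reply = None
    return reply, err.decode("utf-8", "replace"), proc.returncode


# Hypothetical child that fails before writing anything to stdout.
failing_child = [sys.executable, "-c", "raise ValueError('boom')"]
reply, err, code = run_child(failing_child)

print(repr(reply))  # no payload came back
print(code)         # non-zero because of the uncaught exception
```

With this shape the plugin can check the return code and the stderr text before deciding whether the reply is usable.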

EDIT: There's not a lot to be added to this based on my approach to be honest. The structure that I chose to go with has the following flow:
Code: Select all
plugin.py calls: proc = subprocess.Popen(['python2.7', path_to_file, payload, ], stdout=subprocess.PIPE, stderr=subprocess.PIPE )
path_to_file is a reference to a child script, say foo.py which is located in the plugin package
foo.py handles needed imports -- including other python scripts located within the plugin package -- even if they've already been imported into plugin.py
foo.py does its thing in its own process -- thereby "protecting" the plugin process
foo.py collects return data along the way and returns it in a pickle through stdout
I chose to send exception messages through stdout rather than stderr so that foo.py has a chance to finish.
Uncaught exceptions will come through stderr
Process return data and messages in plugin.py
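The child side of the flow above might look roughly like this. It is a sketch under assumptions: run_and_report and the reply layout (a dict with "data" and "errors" keys) are made up for illustration, and an in-memory buffer stands in for stdout so the example runs on its own:

```python
import io
import pickle


def run_and_report(task, stream):
    """Run task(), then write the result plus any caught errors as one pickle."""
    reply = {"data": None, "errors": []}
    try:
        reply["data"] = task()
    except Exception as exc:
        # Caught exceptions travel back inside the payload instead of through
        # stderr, so the child still finishes and exits normally.
        reply["errors"].append("%s: %s" % (type(exc).__name__, exc))
    pickle.dump(reply, stream, protocol=2)


def broken_task():
    """Hypothetical unit of work that fails."""
    raise ValueError("This is the error text to display.")


# In a real child the stream would be sys.stdout (sys.stdout.buffer on Python 3).
buffer = io.BytesIO()
run_and_report(broken_task, buffer)

result = pickle.loads(buffer.getvalue())
print(result["errors"])
```

Anything that escapes the try block still reaches the parent through stderr and a non-zero return code.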


So far this seems to be working like a champ.
