Indigo Domotics • View topic - to u"" or not to u"", that is the question

to u"" or not to u"", that is the question

Posted on
Sun Mar 12, 2017 9:17 am
MartyS offline: Posts: 86; Joined: May 06, 2008; Location: Charlotte, North Carolina

to u"" or not to u"", that is the question

I've looked at several 3rd-party plugins, and at the Indigo 6 (even though I'm running 5) iTunes plugin for examples of when to use u"something" versus "something". I see mixed things so I just have to ask… "to be or not to be, that is the question."

Some of the examples I've seen:

k_name = u"value"
k_name = "value"

Code: Select all
prop[u"name"] = value
prop["name"] = value
self.debugLog(u"Log message")
self.debugLog("Log message")
list = [u"one", u"two", u"three"]
list = ["one", "two", "three"]

To me, things that should not change, such as property key references for checkboxes and hidden fields from Devices.xml, don't need the u"" treatment. Values entered into text fields through the UI do, unless the values are limited to specific ones, as in a validation list (which I don't think needs the treatment either).

And if an entered value can contain unicode characters, how do I assure it's stored into and retrieved/logged from the pluginProps or other areas as such?

How important is it that log messages and other literals use the u"" format?

To actually test some of this, I created a device with unicode characters in its name, and boy did I see errors in the Indigo log from my plugin! Any place where logging is done for the entire Device object (such as in the following code) fails, even if I try to unicode() it:

Code: Select all: print (u"device: %s" % (unicode(dev)))

So I experimented and get:

Code: Select all
>>> dev=indigo.devices[1190300210]
>>> print dev
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 9: ordinal not in range(128)
>>> print unicode(dev)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 9: ordinal not in range(128)
>>> print dev.name
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 9: ordinal not in range(128)
>>> print unicode(dev.name)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 9: ordinal not in range(128)

whereas on this smaller piece of the Device object I can make it work:

Code: Select all: >>> print dev.name.encode("utf-8")
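The failure and the workaround above can be reproduced without Indigo. The following is only a sketch, not code from the thread: it assumes the console ends up encoding the text as ASCII (which is what the traceback suggests), and `try_ascii` is a hypothetical helper name. The name itself is the one given later in the thread.

```python
# The device name used in this thread; \xfc is "ü".
name = u"Heizung B\xfcro Edgeschoss"

def try_ascii(text):
    """Return the error message from an ASCII encode, or None if it succeeds."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None

# Encoding to ASCII fails at the "ü" (index 9), as in the traceback.
error = try_ascii(name)

# Encoding explicitly to UTF-8 succeeds and yields plain bytes.
utf8_bytes = name.encode("utf-8")
```

The error position matches the traceback because "ü" is the tenth character of the name. Encoding explicitly hands the console bytes, so no implicit ASCII conversion is attempted.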

So how do I wrap my debugging statements with references to the entire Device object to make them work? Or are device names supposed to be ASCII only (seems unlikely) so this never shows up in real-world cases?

I hope that the behavior isn't just a version 5 artifact that I'm seeing. I'd hate to waste too many of your brain cycles if that's the case.

Any advice would be appreciated.

/Marty

Posted on
Sun Mar 12, 2017 10:48 am
jay (support) offline: Site Admin; Posts: 18220; Joined: Mar 19, 2008; Location: Austin, Texas

Re: to u"" or not to u"", that is the question

Strings in Python 2 are a mess. In strings that are embedded you can use either unless you have explicit unicode in them. Everything Indigo touches is treated as unicode. The print problem you're seeing is because it automatically calls the str() method, which is not unicode friendly. Try print unicode(dev) instead.
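As a rough illustration of "you can use either" (a sketch only, not from an Indigo plugin; `props` here is an ordinary dict standing in for pluginProps): in Python 2, an ASCII-only str and the matching unicode string compare equal and hash alike, so a key stored one way is found the other way. Non-ASCII text is where the two types diverge.

```python
# ASCII-only keys: the u"" prefix makes no practical difference.
props = {}
props[u"name"] = "value"
same_key = props["name"]      # found under either spelling

# Non-ASCII text: keep it as unicode and encode only at the boundary.
name = u"Heizung B\xfcro"     # \xfc is "ü"
as_bytes = name.encode("utf-8")
char_count = len(name)        # counts characters
byte_count = len(as_bytes)    # counts bytes; "ü" takes two in UTF-8
```

The differing lengths show why mixing encoded bytes and unicode text for non-ASCII values leads to surprises, while ASCII-only literals are safe either way.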

Jay (Indigo Support)

Posted on
Sun Mar 12, 2017 12:13 pm
MartyS offline: Posts: 86; Joined: May 06, 2008; Location: Charlotte, North Carolina

Re: to u"" or not to u"", that is the question

jay (support) wrote:
Strings in Python 2 are a mess. In strings that are embedded you can use either unless you have explicit unicode in them. Everything Indigo touches is treated as unicode. The print problem you're seeing is because it automatically calls the str() method, which is not unicode friendly. Try print unicode(dev) instead.

I understand the need for unicode, how it's represented, etc., as I've worked with I18N for more years than I care to count. It's the Python manipulation/conversions that have me confused. That's why I looked at other plugins for examples.

I've tried print unicode(dev), as I showed in my first set of example outputs, and get the exact same error:

Code: Select all
>>> print unicode(dev)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 9: ordinal not in range(128)

and the same happens with self.debugLog() and other logging methods. :cry:

Unfortunately, the Device object doesn't have an encode method, and calling encode() is the only way I've gotten the device name (by itself) to print.
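One possible way around the missing encode method is to convert the whole object to text first and encode the result. This is a minimal sketch under assumptions: `to_utf8` and `FakeDevice` are hypothetical names, `FakeDevice` is a stand-in and not the real Indigo class, and the thread does not confirm that unicode(dev) itself succeeds on Indigo 5.

```python
try:
    text_type = unicode   # Python 2
except NameError:
    text_type = str       # Python 3, so the sketch runs there too

def to_utf8(obj):
    """Convert any object to UTF-8 bytes via its text form."""
    return text_type(obj).encode("utf-8")

class FakeDevice(object):
    """Hypothetical stand-in for an Indigo device (assumption)."""
    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u"name : %s" % self.name

    __str__ = __unicode__  # lets the same class work on Python 3

encoded = to_utf8(FakeDevice(u"B\xfcro"))
```

In a Python 2 console, `print to_utf8(dev)` would then write bytes, skipping the implicit ASCII conversion, provided the text conversion step works for the object in question.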

Are device names allowed to be in unicode? Maybe I'm just hitting an edge case?

So when a text field from either the plugin or device config is presented to a plugin method it will be represented as u"value" and I can just shove it into self.whatever or a variable as-is to allow a later retrieval as a unicode object?

/Marty

Posted on
Sun Mar 12, 2017 12:22 pm
matt (support) offline: Site Admin; Posts: 21417; Joined: Jan 27, 2003; Location: Texas

Re: to u"" or not to u"", that is the question

Device names can be unicode. I'm not sure why the print unicode() case is failing. I just tested it myself here without an error:

Code: Select all
>>> print unicode(dev.name)
ñánó

What specific device name are you trying to use?

Posted on
Sun Mar 12, 2017 12:31 pm
MartyS offline: Posts: 86; Joined: May 06, 2008; Location: Charlotte, North Carolina

Re: to u"" or not to u"", that is the question

matt (support) wrote:
Device names can be unicode. I'm not sure why the print unicode() case is failing. I just tested it myself here without an error:

Code: Select all
>>> print unicode(dev.name)
ñánó

What specific device name are you trying to use?

The device's name I am using is "Heizung Büro Edgeschoss" without the quotes. It's a string I randomly picked up from a forum post.

Is this an issue that only happens in Python 2.5/2.6 perhaps? I'm at a loss.

/Marty

Posted on
Sun Mar 12, 2017 12:40 pm
matt (support) offline: Site Admin; Posts: 21417; Joined: Jan 27, 2003; Location: Texas

Re: to u"" or not to u"", that is the question

Ah, I didn't catch that you aren't running Indigo 7. I just tried it under Indigo 7 with that name and did not get an error:

Code: Select all
>>> print unicode(dev.name)
Heizung Büro Edgeschoss

I don't recall if the fix is because Indigo 7 uses Python 2.7, or if we changed something. We have definitely made changes/fixes to Indigo's plugin architecture and how it handles unicode in some cases, but I don't recall whether that would explain the specific error you are seeing. My hunch is that it was a change we made, but I don't know which version it was made in.

Posted on
Sun Mar 12, 2017 12:43 pm
matt (support) offline: Site Admin; Posts: 21417; Joined: Jan 27, 2003; Location: Texas

Re: to u"" or not to u"", that is the question

Found it. That was fixed in Indigo 6.0 beta 2.

Posted on
Sun Mar 12, 2017 12:46 pm
MartyS offline: Posts: 86; Joined: May 06, 2008; Location: Charlotte, North Carolina

Re: to u"" or not to u"", that is the question

Thanks, Matt.

I don't have Indigo 7 to try, but I just tried with Indigo 6.1.11 and have the same results as with Indigo 5. Strange, and frustrating.

/Marty

Posted on
Sun Mar 12, 2017 12:49 pm
matt (support) offline: Site Admin; Posts: 21417; Joined: Jan 27, 2003; Location: Texas

Re: to u"" or not to u"", that is the question

Same exact error even when logging to the Event Log window? If so, from Indigo 6 can you copy/paste the Event Log window results of:

indigo.server.log(u"test1: dev name is" + dev.name)
indigo.server.log(u"test2: dev name is" + unicode(dev.name))

Posted on
Sun Mar 12, 2017 1:15 pm
MartyS offline: Posts: 86; Joined: May 06, 2008; Location: Charlotte, North Carolina

Re: to u"" or not to u"", that is the question

matt (support) wrote:
Same exact error even when logging to the Event Log window? If so, from Indigo 6 can you copy/paste the Event Log window results of:

indigo.server.log(u"test1: dev name is" + dev.name)
indigo.server.log(u"test2: dev name is" + unicode(dev.name))

Okay, that's a new one on me. I thought that print in the interactive console would be the same as the indigo.server.log but they aren't. Here's the output to the log using Indigo 6:

Code: Select all
Interactive Shell test1: dev name is Heizung Büro Edgeschoss
Interactive Shell test1: dev name is Heizung Büro Edgeschoss

And I can output unicode(dev) as well.

Looks like my choices are:

Since there might only be a dozen users of the plugin anyway (I have no way of knowing) I'll need to ponder what direction to take.

/Marty
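The difference between print and indigo.server.log observed above comes down to who does the encoding. As a sketch only (not Indigo code; `utf8_writer` is a hypothetical helper), a byte stream can be wrapped so that unicode text written to it is encoded as UTF-8 explicitly rather than implicitly as ASCII:

```python
import codecs
import io

def utf8_writer(byte_stream):
    """Wrap a byte stream so unicode text written to it is UTF-8 encoded."""
    return codecs.getwriter("utf-8")(byte_stream)

# An in-memory buffer stands in for the console here.
buf = io.BytesIO()
out = utf8_writer(buf)
out.write(u"dev name is Heizung B\xfcro Edgeschoss\n")
written = buf.getvalue()
```

In Python 2 the same wrapper is commonly applied to sys.stdout. Whether that is safe inside the Indigo interactive shell is an assumption that would need testing.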

Posted on
Sun Mar 12, 2017 2:21 pm
jay (support) offline: Site Admin; Posts: 18220; Joined: Mar 19, 2008; Location: Austin, Texas

Re: to u"" or not to u"", that is the question

Try:

Code: Select all: print d.name.encode("latin1", errors="ignore")

That should drop any characters that can't be encoded as Latin-1 (using "replace" instead of "ignore" would substitute a "?"). Close enough for the user to be able to identify the device if it's only logging that's an issue.

Jay (Indigo Support)

Posted on
Mon Mar 13, 2017 3:01 am
MartyS offline: Posts: 86; Joined: May 06, 2008; Location: Charlotte, North Carolina

Re: to u"" or not to u"", that is the question

jay (support) wrote:
Try:

Code: Select all
print d.name.encode("latin1", errors="ignore")

That should drop any characters that can't be encoded as Latin-1 (using "replace" instead of "ignore" would substitute a "?"). Close enough for the user to be able to identify the device if it's only logging that's an issue.

I had to change the line to:

Code: Select all: print d.name.encode("latin1", "ignore")

But other than that, the suggestion works—thanks! That would decrease the number of errors logged in v5, but the bigger issue is when I dump all of the Device object into the log which works in v6 but not v5 for unicode values.
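For reference, here is a small sketch of how the error handler passed to encode() changes the result. It is illustrative only. The handler is passed positionally, which is the form that worked above; as far as I know, the errors= keyword form is only accepted from Python 2.7 on.

```python
name = u"Heizung B\xfcro Edgeschoss"   # \xfc is "ü"

# "ignore" silently drops characters the codec cannot represent.
dropped = name.encode("ascii", "ignore")

# "replace" substitutes a "?" for each such character.
marked = name.encode("ascii", "replace")

# Latin-1 can represent "ü", so nothing is dropped for this name.
latin1 = name.encode("latin1", "ignore")
```

For a name like this one, Latin-1 keeps every character, so the handler only matters for characters outside Latin-1 or when targeting plain ASCII.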

I'm going to shelve the issue and go with what works for the majority and if someone needs a unicode device name in v5 with this plugin then I'll look back into it.

/Marty
