Robust String Split

Posted on
Thu Dec 13, 2018 7:25 am
DaveL17 offline
Posts: 6742
Joined: Aug 20, 2013
Location: Chicago, IL, USA

I have a situation where I need to inject a placeholder in a string so that I can split on it later. Think something like this:

Code:
s = "string_1" + some_value + "string_2"
t = "string_3" + some_value + "string_4"

# later that day
s1 = s.split(some_value)
t1 = t.split(some_value)

Strings 1 and 3 are device IDs.
Strings 2 and 4 are device state names.

So I want to be reasonably sure that some_value isn't something that could appear in a state name (which would hose the split). The state names could come from any Indigo device, so I'm not in control of those.

My question is, what would folks do to ensure that 'some_value' is something unique that (at least probably) won't appear in string_2 or string_4?
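To make the failure mode concrete, here is a small sketch of what happens when the placeholder also shows up in the state name. The underscore placeholder, the device ID, and the state name are made-up values for illustration only.

```python
# Hypothetical values, chosen only to show how a poorly chosen placeholder breaks the split.
some_value = "_"               # placeholder that can also occur in a state name
device_id = "12345678"         # made-up device id
state_name = "battery_level"   # made-up state name that contains the placeholder

s = device_id + some_value + state_name
parts = s.split(some_value)

# Three pieces come back instead of two, so a simple two-way unpack would fail.
print(parts)  # ['12345678', 'battery', 'level']
```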

I came here to drink milk and kick ass....and I've just finished my milk.

[My Plugins] - [My Forums]

Posted on
Thu Dec 13, 2018 8:06 am
RogueProeliator offline
Posts: 2501
Joined: Nov 13, 2012
Location: Baton Rouge, LA

Re: Robust String Split

DaveL17 wrote:
So I want to be reasonably sure that some_value isn't something that could appear in a state name (which would hose the split). The state names could come from any Indigo device, so I'm not in control of those.

My question is, what would folks do to ensure that 'some_value' is something unique that (at least probably) won't appear in string_2 or string_4?

I can think of a couple of techniques that people use to accomplish what you want, but there may be a simpler solution IF the strings you mention are always state names and not state values. If that is the case, simply choose your splitting values as characters or strings of characters which are illegal in state names. I don't recall offhand what the requirements for a state name are, though. The state name lives in an XML file, so you would most likely be safe using something with & in it, unless someone really chose to escape their state name. :)
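One way to act on this is to reject, at join time, any piece that already contains the delimiter, so that a bad choice fails loudly instead of corrupting the split later. This is only a sketch: `join_key`, `split_key`, and `DELIMITER` are hypothetical names, and whether "&" is really excluded from state names is assumed here rather than verified.

```python
DELIMITER = "&"  # assumption: a character that cannot occur in a state name


def join_key(device_id, state_name, delimiter=DELIMITER):
    """Join the two pieces, refusing any piece that already contains the delimiter."""
    for piece in (device_id, state_name):
        if delimiter in piece:
            raise ValueError("%r contains the delimiter %r" % (piece, delimiter))
    return device_id + delimiter + state_name


def split_key(key, delimiter=DELIMITER):
    """Split a joined key back into (device_id, state_name)."""
    device_id, state_name = key.split(delimiter)
    return device_id, state_name


key = join_key("12345678", "temperature")  # made-up id and state name
print(split_key(key))  # ('12345678', 'temperature')
```

If a state name ever does contain the delimiter, `join_key` raises a `ValueError` at the point where the string is built, which is easier to track down than a bad unpack later on.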

Posted on
Thu Dec 13, 2018 8:29 am
DaveL17 offline
Posts: 6742
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Robust String Split

RogueProeliator wrote:
If that is the case, simply choose your splitting values as characters or strings of characters which are illegal in state names.

Oooh, what a good idea. It seems that a '/' can't appear anywhere in a state name, so that should work.

Thanks!

Posted on
Thu Dec 13, 2018 10:01 am
jay (support) offline
Site Admin
Posts: 18199
Joined: Mar 19, 2008
Location: Austin, Texas

Re: Robust String Split

The substitution markup can give you a clue:

Code:
%%d:id:statekey%%

Colons can't be in state names. The general guideline is that state keys must start with a character then can contain letters and numbers. I don't think anything else is allowed.
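Read literally, that guideline could be checked with a pattern like the one below. This is a sketch of the guideline as stated in this post, not Indigo's actual validation code. It assumes "character" means an ASCII letter, and `is_plausible_state_key` is a hypothetical helper name.

```python
import re

# Assumption: a letter first, then only letters and digits.
STATE_KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_plausible_state_key(key):
    """Return True if key follows the guideline as interpreted above."""
    return STATE_KEY_PATTERN.fullmatch(key) is not None


# Made-up candidates: the first two pass, the rest fail under this reading.
for candidate in ("temperature", "sensor2", "2fast", "bad:key", "bad/key"):
    print(candidate, is_plausible_state_key(candidate))
```

Under this reading, neither ":" nor "/" can appear in a state key, so either one would be a safe delimiter.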

The regex for the device substitution is:

Code:
re.compile("\%%d:([0-9]+):([A-z0-9]+)%%")

so the match list returned has the ID as the first element and the statekey as the second.
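As a rough illustration of that match list, the sketch below runs the pattern over a made-up string. The sample text and device IDs are hypothetical. The pattern is written here as a raw string without the leading backslash, which should match the same text, since an escaped `%` is just a literal `%`. One thing worth knowing: the range `A-z` also covers the six ASCII characters that sit between `Z` and `a` (`[`, `\`, `]`, `^`, `_` and the backtick), so the pattern accepts underscores as well as letters and digits.

```python
import re

# Same pattern as above, as a raw string and without the redundant backslash.
device_sub = re.compile(r"%%d:([0-9]+):([A-z0-9]+)%%")

# Made-up text containing one substitution.
text = "Current reading: %%d:12345678:temperature%%"

# findall returns one (id, statekey) tuple per substitution found.
matches = device_sub.findall(text)
print(matches)  # [('12345678', 'temperature')]

# The A-z range also matches an underscore, because "_" lies between "Z" and "a" in ASCII.
underscore_matches = device_sub.findall("%%d:1:battery_level%%")
print(underscore_matches)  # [('1', 'battery_level')]
```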

Jay (Indigo Support)
Twitter | Facebook | LinkedIn

Posted on
Thu Dec 13, 2018 12:18 pm
howartp offline
Posts: 4559
Joined: Jan 09, 2014
Location: West Yorkshire, UK

Re: Robust String Split

Do s and t have to be stored as strings?

Couldn't they be dicts/arrays/lists (I have to look them all up to get the right one, but you know what I mean!)? Then you could just index the values you want without worrying about substituting in a delimiter.
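For what it's worth, the idea could look something like the sketch below, where the pieces are kept together in a structure and pulled out by index or key instead of being split out of a string. The IDs and state names are made up, and it assumes the values never have to pass through a string-only field.

```python
# Made-up device ids and state names, kept as tuples instead of joined strings.
s = ("12345678", "pressure")
t = ("12345679", "temperature")

# Indexing or unpacking replaces the split entirely, so no delimiter is needed.
device_id, state_name = s
print(device_id, state_name)  # 12345678 pressure

# A dict works the same way when the fields should be named.
t_as_dict = {"device_id": t[0], "state": t[1]}
print(t_as_dict["state"])  # temperature
```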


Posted on
Thu Dec 13, 2018 4:08 pm
DaveL17 offline
Posts: 6742
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Robust String Split

jay (support) wrote:
Colons can't be in state names. The general guideline is that state keys must start with a character then can contain letters and numbers. I don't think anything else is allowed.

Thanks Jay. The forward slash seems to be working, but a colon would probably work just as well.

Posted on
Thu Dec 13, 2018 4:26 pm
DaveL17 offline
Posts: 6742
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Robust String Split

howartp wrote:
Do s and t have to be stored as strings?

They do.

The use case is a list of tuples for a dynamic config menu where, in this instance, I'm returning something like this (although it's not precisely how I'm building it):

Code:
[("string_1" + "/" + "string_2", "string_2"), ("string_3" + "/" + "string_4", "string_4")]

or

Code:
[(u'12345678/pressure', u'pressure'), (u'12345679/temperature', u'temperature')]

This is so I can identify the target device ID and state name in one go from whatever the user selects. If the user selects the 'temperature' menu option, I also need to pass the ID of the device that contains the selected state.

So I can do:

Code:
choice_id, choice_state = valuesDict['menu_item'].split('/')

I tried storing the chosen option as a stringified tuple/list/dict, and none of them seemed to work. The suggested option works great.
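Put together, the round trip might look like the sketch below. It is not the plugin's actual code: `build_menu`, `parse_choice`, and the `devices` mapping are hypothetical stand-ins, and a plain dict plays the role of `valuesDict`. Passing a maxsplit of 1 is a small extra safeguard, so that only the first "/" is treated as the separator.

```python
def build_menu(devices, delimiter="/"):
    """Return (value, label) tuples for every state of every device.

    devices is a hypothetical mapping of device id -> list of state names.
    """
    return [
        (str(dev_id) + delimiter + state, state)
        for dev_id, states in devices.items()
        for state in states
    ]


def parse_choice(choice, delimiter="/"):
    """Split a selected menu value back into (device id, state name)."""
    choice_id, choice_state = choice.split(delimiter, 1)
    return int(choice_id), choice_state


devices = {12345678: ["pressure"], 12345679: ["temperature"]}  # made-up data
menu = build_menu(devices)
print(menu)

values_dict = {"menu_item": menu[1][0]}  # stand-in for the user's selection
print(parse_choice(values_dict["menu_item"]))  # (12345679, 'temperature')
```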

Posted on
Sat Dec 15, 2018 8:35 am
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: Robust String Split

A bit late, but here is how I do it:

Use e.g. "xxcc:1;ccbb" as the splitter.
It is very unlikely that that string would ever appear in any name or text.

Posted on
Sat Dec 15, 2018 10:46 am
DaveL17 offline
Posts: 6742
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Robust String Split

kw123 wrote:
A bit late, but here is how I do it:

Use e.g. "xxcc:1;ccbb" as the splitter.
It is very unlikely that that string would ever appear in any name or text.

Just as an FYI, the forward slash solution seems to work well, and a forward slash seemingly CANNOT appear in a state name. That gets rid of the "unlikely" bit. :D

As it turns out, I didn't need to use a split after all for this particular effort, but this will be good for some other spots that I now have to go chase down.
