Migrate SQLite to PostGreSQL

Posted on
Sun Jan 20, 2019 3:16 pm
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

Aaah! Could it the fact that I had not commented out the example "purgeDevice" and "skipDevice" settings as well as "testThisDeviceOnly" that were in your original script?!

I have now set
purgeDevice = ""
skipDevices = ""
testThisDeviceOnly = ""

and restarted the script so I have to wait about 30 minutes or more to see if it was those three settings causing the error.

Posted on
Sun Jan 20, 2019 3:55 pm
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

Nope did not help... :(

Still stops with same error on same line:

Code: Select all
stepCreateImport           89.2@22:14 creating POSTGRES import commands from SQLite querries   (  ~ 84 minutes for this database size )
....................      259.6@22:17  # 20
....................      507.5@22:21  # 40
....................      747.8@22:25  # 60
....................      958.1@22:28  # 80
....................     1588.3@22:39  # 100
.......Traceback (most recent call last):
  File "sql2postgre.py", line 1424, in <module>
    mainPGM()
  File "sql2postgre.py", line 274, in mainPGM
    SQLITE2POSTGRES(timeStart)
  File "sql2postgre.py", line 541, in SQLITE2POSTGRES
    recCount,recSkipped,purgeCount,x = stripPurgeRecords(x,fname,purgeRecordsWithSameTime,purgeDevice)
ValueError: need more than 3 values to unpack

Posted on
Sun Jan 20, 2019 4:30 pm
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

Ok, I re-ran the script with debug level = 5 and now it is clearer where it stops:
Code: Select all
CreateImport 1705.3@23:24 creating # 107/260-eventlog_history
Traceback (most recent call last):
  File "sql2postgre.py", line 1424, in <module>
    mainPGM()
  File "sql2postgre.py", line 274, in mainPGM
    SQLITE2POSTGRES(timeStart)
  File "sql2postgre.py", line 541, in SQLITE2POSTGRES
    recCount,recSkipped,purgeCount,x = stripPurgeRecords(x,fname,purgeRecordsWithSameTime,purgeDevice)
ValueError: need more than 3 values to unpack
So the error occurs when the script comes to the point where it will create eventlog_history record import commands?

Posted on
Sun Jan 20, 2019 4:53 pm
kw123 offline
User avatar
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: Migrate SQLite to PostGreSQL

could you post the script YOU are using .
use [ code ] to post

Karl

Posted on
Mon Jan 21, 2019 2:52 am
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

@kw123: My current version of your modified script at my DropBox: https://www.dropbox.com/s/jdimcyyx7cj4s ... re.py?dl=0

Ok after I did the following in the script:
skipDevices = ["eventlog_history"]

the script was able to finish the part where it creates the import commands:
Code: Select all
CreateImport 3782.4@00:38 creating # 260/260-device_history_771329309
purgeRecordsWithSameTime 3782.4@00:38 for fname dev_771329309 out of 60 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
purgeRecordsWithSameTime 3782.5@00:38 for all records  out of 35787895 records 6889473 skipped =19.3%; datePurged: 0 = 0.0%
stepCreateImport         3782.5@00:38 FINISHED
And now it actually started the import but it looks like it ends very fast?
Code: Select all
stepDoIMPORT             3782.5@00:38 STARTED  this might take some time  (~214 minutes for this step ), # of records to be imported: 28898422
dropPostGresDB 3782.6@00:38 START
dropPostGresDB 3782.9@00:38 FINISHED
I do see the following error though:
Code: Select all
createPostGresDB 3782.9@00:38 START
createPostGresDB 3782.9@00:38 ERRROR msg: ERROR:  new collation (C) is incompatible with the collation of the template database (en_US.UTF-8)
HINT:  Use the same collation as in the template database, or use template0 as template.
I guess I need to change something in the encoding setting?

Import starts but it looks like it finishes directly?
Code: Select all
 importing 3782.9@00:38 ...
doPOSTGREupdate 3782.9@00:38 0/1; importing   bytes:8850797813, file:postgresimportFromCOPY-ALL
stepDoIMPORT             3782.9@00:38 FINISHED
And at this point I get an error alert when script is trying to remove a file:
Code: Select all
stepREFRESHquerry        3782.9@00:38 check for updates to SQLite since copy of DB was made
rm: /Users/dator/Documents/convert/temp/postgresUPDATE-*: No such file or directory
Script continues and checks for updates on all 260 tables:
Code: Select all
REFRESHquerry 3783.1@00:38 0/260 :device_history_1208422578
purgeRecordsWithSameTime 3783.1@00:38 for fname dev_1208422578 out of 1 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
REFRESHquerry 3783.2@00:38 1/260 :device_history_1533782331
purgeRecordsWithSameTime 3783.2@00:38 for fname dev_1533782331 out of 1 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
......
and then stops again when it tries to check table number 105 "eventlog_history" even though I had set the variable skipDevices = ["eventlog_history"]:
Code: Select all
REFRESHquerry 3791.6@00:39 105/260 :device_history_306539892
purgeRecordsWithSameTime 3791.6@00:39 for fname dev_306539892 out of 1 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
REFRESHquerry 3791.7@00:39 106/260 :eventlog_history
Traceback (most recent call last):
  File "sql2postgre.py", line 1425, in <module>
    mainPGM()
  File "sql2postgre.py", line 275, in mainPGM
    SQLITE2POSTGRES(timeStart)
  File "sql2postgre.py", line 622, in SQLITE2POSTGRES
    recCount,recSkipped,purgeCount,x = stripPurgeRecords(x,fname,purgeRecordsWithSameTime,purgeDevice)
ValueError: need more than 3 values to unpack
The line with createPostGresDB command that failed with encoding error I have now added "TEMPLATE = 'template0'":
Code: Select all
   cmd=PGpath+"psql -U "+PGuid+" -c \"CREATE DATABASE "+DB+" WITH OWNER = "+PGuid+" ENCODING = 'UTF8' TABLESPACE = pg_default LC_COLLATE = 'C' LC_CTYPE = 'C' TEMPLATE = 'template0' CONNECTION LIMIT = -1;\""

It looks like the encoding issue when creating new databases is common and I found several threads on the web with solutions and comments to this issue. For example these two:
https://stackoverflow.com/questions/18870775/how-to-change-the-template-database-collection-coding and https://github.com/mozilla/chronicle/issues/367

After the change as above with "TEMPLATE = 'template0'" to the createPostGresDB command I have now restarted the script again... back soon when new results are available... :-)

Posted on
Mon Jan 21, 2019 4:36 am
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

:D Ok so it is actually importing all the tables at the moment into the new PostGreSQL database!

Code: Select all
stepCreateImport         4107.7@11:10 FINISHED
stepDoIMPORT             4107.7@11:10 STARTED  this might take some time  (~202 minutes for this step ), # of records to be imported: 28909420
dropPostGresDB 4107.9@11:10 START
dropPostGresDB 4108.2@11:11 FINISHED
createPostGresDB 4108.2@11:11 START
createPostGresDB 4108.7@11:11 FINISHED
importing 4108.7@11:11 ...
doPOSTGREupdate 4108.7@11:11 0/1; importing   bytes:8853805108, file:postgresimportFromCOPY-ALL
According to the first line above it will take approx 202 minutes so I guess I will have to wait a little bit more than 3 hours to see if it succeeds.
I can see in Postico that the database is growing and some of the tables are already 40-70MB in size so it looks like it is working (so far)... 8)

Posted on
Mon Jan 21, 2019 6:47 am
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

Import into PostGreSQL finished ok:

Code: Select all
stepDoIMPORT             4107.7@11:10 STARTED  this might take some time  (~202 minutes for this step ), # of records to be imported: 28909420
dropPostGresDB 4107.9@11:10 START
dropPostGresDB 4108.2@11:11 FINISHED
createPostGresDB 4108.2@11:11 START
 createPostGresDB 4108.7@11:11 FINISHED
importing 4108.7@11:11 ...
doPOSTGREupdate 4108.7@11:11 0/1; importing   bytes:8853805108, file:postgresimportFromCOPY-ALL
stepDoIMPORT            12888.8@13:37 FINISHED
But when the script started to check for updates in prod db it stopped again on the eventlog_history even though I have set
skipDevices = ["eventlog_history"] :
Code: Select all
stepREFRESHquery        12888.8@13:37 check for updates to SQLite since copy of DB was made
REFRESH12889.1@13:37 0/265 :device_history_1208422578
purgeRecordsWithSameTime12889.3@13:37 for fname dev_1208422578 out of 1 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
REFRESH12889.4@13:37 1/265 :device_history_1533782331
purgeRecordsWithSameTime12889.5@13:37 for fname dev_1533782331 out of 1 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
....
REFRESH12931.4@13:38 109/265 :device_history_306539892
purgeRecordsWithSameTime12931.4@13:38 for fname dev_306539892 out of 1 records 0 skipped = 0.0%; datePurged: 0 = 0.0%
REFRESH12931.5@13:38 110/265 :eventlog_history
Traceback (most recent call last):
  File "sql2postgre.py", line 1425, in <module>
    mainPGM()
  File "sql2postgre.py", line 275, in mainPGM
    SQLITE2POSTGRES(timeStart)
  File "sql2postgre.py", line 622, in SQLITE2POSTGRES
    recCount,recSkipped,purgeCount,x = stripPurgeRecords(x,fname,purgeRecordsWithSameTime,purgeDevice)
ValueError: need more than 3 values to unpack

Posted on
Tue Jan 22, 2019 12:02 pm
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

I added some more debug logging to the function call in order to try to see what goes wrong when the "stepREFRESHquery" starts

This is how the extra debug lines look on all tables except for the table "eventlog_history":

Code: Select all
**************
Variables sent to stripPurgeRecords
 x = 11445.2@16:44 ['218347;:-;2019-01-22 13:59:55;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;6;:-;;:-;', '218348;:-;2019-01-22 14:00:56;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;5;:-;;:-;', '218349;:-;2019-01-22 16:05:59;:-;;:-;24.6;:-;24.6 \xc2\xb0C / 2%;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;24.6;:-;', '218350;:-;2019-01-22 16:07:00;:-;;:-;24.7;:-;24.7 \xc2\xb0C / 2%;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;24.7;:-;', '218351;:-;2019-01-22 16:09:02;:-;;:-;24.6;:-;24.6 \xc2\xb0C / 2%;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;24.6;:-;', '']
fname = device_history_1782321349
purgeRecordsWithSameTime = 11445.2@16:44 True
purgeDevice =
**************


This is how it looks like when the "stepREFRESHquery" starts on table "eventlog_history" and where it fails:

Code: Select all
**************
Variables sent to stripPurgeRecords
 x = 11475.4@16:45 ['471604;:-;2019-01-22 14:51:37;:-;;:-;;:-;False;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;', '471605;:-;2019-01-22 14:51:37;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;disconnected;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;', '471606;:-;2019-01-22 14:51:37;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;disconnected;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;;:-;',
< and many more data in this loooooong vector for the table "eventlog_history" >
fname =] eventlog_history
purgeRecordsWithSameTime = 11476.1@16:45 True
purgeDevice =
**************


So the ONLY difference is that the vector "x" is muuuuuch longer when it is the eventlog_history table. However, there are other tables that has a long vector for variable x and they work fine. It is ONLY on the table eventlog_history I get this error and the script stops:

Code: Select all
Traceback (most recent call last):
  File "sql2postgre.py", line 1432, in <module>
    mainPGM()
  File "sql2postgre.py", line 276, in mainPGM
    SQLITE2POSTGRES(timeStart)
  File "sql2postgre.py", line 629, in SQLITE2POSTGRES
    recCount,recSkipped,purgeCount,x = stripPurgeRecords(x,fname,purgeRecordsWithSameTime,purgeDevice)
ValueError: need more than 3 values to unpack


It looks like that something goes wrong in the function "stripPurgeRecords" when processing the table "eventlog_history". I have tried to understand what part in that function is causing this error but I am somewhat stuck and I am now thinking of adding a check that if it is the eventlog_history table I will just not enter that function. I have no need of updating the eventlog_table properly in the database.


Here is the code for the function "stripPurgeRecords" and I think the error is caused by variable y not sent back on a specific situation?
Code: Select all
####-----------------  strip / drop records that are not needed form data.   eg 4 entries with the same date/time: only keep last record of the 4. It should have all new parameters  ---------
def stripPurgeRecords(x,fname,purgeRecordsWithSameTime, purgeDevice=""):
   global splitRecordsString
   if len(x)<1: return 0,0,0,[]
   if fname in purgeDevice: # is this one in the list we would like to do purgeDATE on?
      purgeDate =purgeDevice[fname] # get the timestamp
   else:
      purgeDate="9999999999999"
   # init counters ..
   purgeCount =0
   toShow= fname.split("_")
   if len(toShow)<3: return 0,0,x
   y=[]
   recSkipped=0
   recCount=0
   lastLine = x[0]
   fields=lastLine.split(splitRecordsString)
   if len(fields)<2:
      lastDate= ""
   else:
      lastDate= fields[1]
   for line in x:            # loop though "INSERT ... " lines  and check we we wnat to keep them
      recCount+=1
      if recCount==1: continue # skip first line, always empty
      fields=line.split(splitRecordsString)
      if len(fields)<2: continue
      if purgeDate !="9999999999999":         # is datepurge on for this one?
         if fields[1].replace("-","").replace(" ","").replace(":","") < purgeDate:  # first convert to simple string, then compare
            purgeCount+=1
            continue
      if fields[1] != lastDate or not purgeRecordsWithSameTime:
         y.append(lastLine)      # if changed write out the last record
         lastLine=line
         lastDate= fields[1]
      else:                  # if this is the same timestamp as the last one skip, just count the # of skipped records
         recSkipped+=1
   toLog(3,"purgeRecordsWithSameTime", "for fname "+str(toShow[0][:3]+"_"+toShow[2]).rjust(13)+" out of "+str(recCount)+" records "+ str(recSkipped)+" skipped =" +"%4.1f"%(100.*recSkipped/max(1.,recCount))+"%; datePurged: "+str(purgeCount)+" =" +"%4.1f"%(100.*purgeCount/max(1.,recCount))+"%")
   return recCount,recSkipped,purgeCount,y
Last edited by CrazyFin on Tue Jan 22, 2019 12:05 pm, edited 1 time in total.

Posted on
Tue Jan 22, 2019 12:03 pm
kw123 offline
User avatar
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: Migrate SQLite to PostGreSQL

It might have problems when there are line breaks in the data.


Sent from my iPhone using Tapatalk

Posted on
Tue Jan 22, 2019 12:46 pm
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

Look at the line
Code: Select all
if len(toShow)<3: return 0,0,x

That line will only return 3 variables/values and not 4 which is the functions variable list:
Code: Select all
def stripPurgeRecords(x,fname,purgeRecordsWithSameTime, purgeDevice="")
I think this is causing the error!

I will test with changing it to:

Code: Select all
if len(toShow)<3: return 0,0,x,[]

Posted on
Mon Aug 05, 2019 4:30 pm
InsteonDiego offline
Posts: 48
Joined: Dec 07, 2016

Re: Migrate SQLite to PostGreSQL

Update: After stepping through the migrate script, it seems that a change at line 1398 is necessary to return the proper four values for unpacking. However, I think the correct change is 0,0,0,x instead of 0,0,x,[]. The latter produces a type mismatch for me a few steps later, whereas the former allows for the script to complete. Seems to also make sense based on what I can read from variable naming and Karl's intentions. I am currently in the final loading stage of data into the Postgres db . . . so hopefully I am in the home stretch . . .

Hello CrazyFin -

Did you ever find a resolution to this problem? I am working on the migration and getting stopped at the same location you were. Any pointers to resolution are appreciated.

Thanks!

Adam

Posted on
Tue Aug 06, 2019 9:28 am
CrazyFin offline
Posts: 381
Joined: Jan 08, 2015
Location: Stockholm, SWEDEN

Re: Migrate SQLite to PostGreSQL

@adam:

Sorry, I never was able to finalize the migration from SQLite to PostGreSQL.

I tried with several different tools for migrating from SQLite to PostGreSQL but none of the are fully automatic and all of them required quite much manual input and manual changes to get it all right.

I decided to just start from "scratch" with collecting new data from that date and I have saved my old SQLite database if will have time in the future to merge all my old data into my current PostGreSQL database.

Who is online

Users browsing this forum: No registered users and 4 guests