Flat File Length Management Script

Posted on
Wed Jul 15, 2015 8:40 pm
DaveL17 offline
Posts: 6741
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Flat File Length Management Script

I have a Python script that writes select device state information to a flat file (CSV file) for use in constructing Gnuplot charts. Due mainly to laziness, I always just added another line to the end of the file and got on with my life. After a while, the file started to get too large--and for no good reason. For example, I use 3 days' worth of data--288 observations--but my CSV file could quickly grow to many thousands of lines. In any event, I finally decided to whip up a simple script to keep the file to a more rational size. I run it once per day through a schedule that fires at a time when there is no chance that the CSV file will be in use by another process. There are five configurable variables--but only two need explaining:

keep_first_lines: When True, the script will retain the first N lines of the data. When False, the script will retain the last N lines of the data. N is the value of target_lines.
keep_header_lines: This value controls how many header lines to retain in addition to the N data lines. A setting of 0 retains no header lines, a setting of 1 retains one line, and so on.
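
As a rough sketch of how these two settings interact, here is the selection logic on a plain list of lines. Note that trim_lines is a hypothetical helper written only for illustration (it is not part of the script), and it assumes that a file too short to need trimming is returned unchanged.

```python
def trim_lines(lines, target_lines, keep_first_lines=False, keep_header_lines=0):
    """Return the lines that would be kept (hypothetical helper, illustration only)."""
    keep_total = target_lines + keep_header_lines
    if len(lines) <= keep_total:
        return list(lines)  # assumption: nothing to trim, so keep everything
    if keep_first_lines:
        # Headers plus the first N data lines are one contiguous slice.
        return lines[:keep_total]
    # Headers plus the last N data lines.
    header = lines[:keep_header_lines]
    tail = lines[len(lines) - target_lines:]
    return header + tail


# One header line followed by seven data rows.
sample = ["header\n"] + ["row %d\n" % i for i in range(1, 8)]

print(trim_lines(sample, 3, keep_first_lines=False, keep_header_lines=1))
# keeps the header plus rows 5, 6 and 7
print(trim_lines(sample, 3, keep_first_lines=True, keep_header_lines=1))
# keeps the header plus rows 1, 2 and 3
```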

Feel free to post improvements or criticisms, and use at your own risk.

Code: Select all
import os
import shutil

################################################################
# User configurable variables.
source = '/Users/.../Dropbox/Public/charts.csv'
backup = '/Users/.../Dropbox/Public/charts backup.csv'

target_lines = 10
keep_first_lines = False
keep_header_lines = 0
################################################################

try:
    # Make a backup of the original file in case something goes wrong.
    shutil.copyfile(source, backup)

    # Open the original file in read-only mode and count the number of lines.
    with open(source, 'r') as oldfile:
        lines = oldfile.readlines()
    old_num_lines = len(lines)

    # Reopen the original file in write mode and write back only the lines we want. If we
    # have no more than the header lines plus the target lines to start with, move on.
    # (Comparing against target_lines alone could write a header line twice.)
    keep_total = target_lines + keep_header_lines

    # Keep the header lines (if any) and the last lines of the file.
    if old_num_lines > keep_total and not keep_first_lines:
        with open(source, 'w') as newfile:
            newfile.writelines(lines[:keep_header_lines])
            newfile.writelines(lines[old_num_lines - target_lines:])

    # Keep the header lines (if any) and the first lines of the file.
    elif old_num_lines > keep_total and keep_first_lines:
        with open(source, 'w') as newfile:
            newfile.writelines(lines[:keep_total])

    # Remove the backup file if we have been successful. NOTE: The deleted file *will not*
    # be sent to the Trash folder.
    os.remove(backup)

    indigo.server.log(u"Flat file truncated.")

# Catch ordinary errors only; the backup file is left in place for recovery.
except Exception:
    indigo.server.log(u"Something has gone wrong.")

I came here to drink milk and kick ass....and I've just finished my milk.

[My Plugins] - [My Forums]

Posted on
Fri Jul 17, 2015 4:54 pm
howartp offline
Posts: 4559
Joined: Jan 09, 2014
Location: West Yorkshire, UK

Re: Flat File Length Management Script

Given CSVs often have a header line, could you add a variable to keep the header line when using retain_last_lines mode?



Posted on
Fri Jul 17, 2015 6:13 pm
DaveL17 offline
Posts: 6741
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Flat File Length Management Script

howartp wrote:
Given CSVs often have a header line, could you add a variable to keep the header line when using retain_last_lines mode?

That's a good idea. Thanks for suggesting it!
Dave


Posted on
Sat Jul 18, 2015 5:16 am
DaveL17 offline
Posts: 6741
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Flat File Length Management Script

I've updated the script to include @howartp's suggestion. The script should now account for header rows through the variable keep_header_lines. For example,

Code: Select all
target_lines = 10
keep_first_lines = False
keep_header_lines = 2

These settings should retain the first 2 lines of the file (headers) and the last 10 lines of the file (data) -- for a total of 12 lines.
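
To check that arithmetic outside of Indigo, here is a small sketch that builds a throwaway CSV and keeps 2 header lines plus the last 10 data lines. It is a variation rather than what the script itself does: it reads the headers with itertools.islice and lets a bounded collections.deque hold only the tail, so the whole file never has to sit in memory. The temporary file and its contents are made up purely for the test.

```python
import collections
import itertools
import os
import tempfile

target_lines = 10
keep_header_lines = 2

# Build a throwaway CSV: 2 header lines followed by 50 data lines (made-up data).
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w") as f:
    f.write("header 1\n")
    f.write("header 2\n")
    for i in range(50):
        f.write("%d,%d\n" % (i, i * 2))

# Read the header lines first, then let a bounded deque keep only the
# last target_lines of whatever remains in the file.
with open(path, "r") as f:
    header = list(itertools.islice(f, keep_header_lines))
    tail = collections.deque(f, maxlen=target_lines)

kept = header + list(tail)
print(len(kept))  # 12
print(kept[2])    # first retained data line: 40,80

# Clean up the throwaway file.
os.remove(path)
```

Because the headers are consumed before the tail is collected, a header line cannot show up twice even when the file is only slightly longer than the target.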

Dave


Posted on
Sat Jul 18, 2015 5:18 am
howartp offline
Posts: 4559
Joined: Jan 09, 2014
Location: West Yorkshire, UK

Re: Flat File Length Management Script

Excellent - I have no use for it at the moment, but I do CSV/Data manipulation at work regularly which is what prompted the suggestion for the benefit of others.



Posted on
Mon Jul 11, 2016 7:53 pm
DaveL17 offline
Posts: 6741
Joined: Aug 20, 2013
Location: Chicago, IL, USA

Re: Flat File Length Management Script

Moderator's Note: Moved to the Dave's Scripts forum.

