Gheek.net

September 11, 2015

Using Perl to create a CSV from data in a list format

Filed under: perl — lancevermilion @ 11:40 am

I needed to build a CSV from a list of data that was already set up in a key/value format. The problem was that not every chunk of data in the list had the same keys. For example, given the sample list below, I would expect the output that follows it.

Sample List

start data chunk abc
  key1 value1
  key2 value2
start data chunk def
  key2 value2
start data chunk ghi
  key1 value1
start data chunk jkl
  key1 value1
  key2 value2

Desired Sample CSV output

start data chunk,abc,def,ghi,jkl
key1,value1,,value1,value1
key2,value2,value2,,value2

I wrote some Perl code to accomplish this. It is not pretty, and I am sure people will say it can be done better, faster, and in fewer lines, but it works for my purposes. Enjoy it if it helps you.

Here is the code

#!/usr/local/bin/perl
use strict;
use warnings;

my $filename = $ARGV[0];
die "usage: $0 <input file>\n" unless defined $filename;
my @arr = do {
    open my $fh, "<", $filename
        or die "could not open $filename: $!";
    <$fh>;
};

# Chunk name => { key => value } for every chunk found in the input
my $href_keys = {};
# Unique set of row labels (the "start data chunk" label plus every key seen)
my $href_data = {};
# Row label => array of values, one per chunk (chunks sorted by name).
# If a chunk lacks a key, a blank value is added so every CSV row has the same number of columns.
my $href_csv = {};

my $tmpkey = '';

for my $line (@arr)
{
  chomp($line);
  $line =~ s/\r//g; #Just in case you have CR
  $line =~ s/\n//g; #Just in case you have New Line / LF
  if ( $line =~ /^(start data chunk) (.*)/ )
  {
    $tmpkey = $2;
    $href_keys->{$tmpkey}->{$1} = $2;
    $href_data->{$1} = '';
  }
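  # Indented "key value" lines. The value must start with "value" plus a digit,
  # as in the sample list; adjust this pattern to match your own data.
  if ( $line =~ /^\s+(.*) (value\d.*)/ )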
  {
    $href_keys->{$tmpkey}->{$1} = $2;
    $href_data->{$1} = '';
  }
}


for my $href_data_k1 ( sort keys %$href_data )
{
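  my $newkey = $href_data_k1;
  # Prefix the header label with "_" so it sorts ahead of lowercase keys and becomes the first CSV row.
  $newkey =~ s/(start data chunk)/_$1/g;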
  for my $href_keys_k1 ( sort keys %$href_keys )
  {
    # Use a blank cell when this chunk has no value for this key.
    # Testing with defined keeps values such as "0" from being dropped.
    my $val = $href_keys->{$href_keys_k1}->{$href_data_k1};
    $val = '' unless defined $val;
    push(@{$href_csv->{"$newkey"}}, $val);
  }
}

for my $href_csv_k1 ( sort keys %$href_csv )
{
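  my $href_csv_newk1 = $href_csv_k1;
  # Strip only the leading sort prefix from the header row, leaving underscores in real keys intact.
  $href_csv_newk1 =~ s/^_(start data chunk)/$1/;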
  print "$href_csv_newk1,", join(",", @{$href_csv->{$href_csv_k1}}), "\n";
}
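
The script above prints chunks and keys in sorted order, which happens to match the sample. If you would rather keep the order in which chunks and keys first appear in the input, something like the following might work. This is only a rough sketch, not a drop-in replacement: chunks_to_csv and csv_field are names I made up for illustration, it assumes each key is a single word followed by its value, it quotes fields that contain commas or double quotes, and it needs Perl 5.10 or newer for the // operator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): quote a field
# only when it contains a comma, double quote, or line break.
sub csv_field {
    my ($field) = @_;
    return $field unless $field =~ /[",\r\n]/;
    (my $quoted = $field) =~ s/"/""/g;
    return qq{"$quoted"};
}

# Hypothetical helper: turn the list lines into CSV rows, keeping chunks
# and keys in the order they first appear in the input.
sub chunks_to_csv {
    my (@lines) = @_;
    my (@chunks, @keys, %seen_chunk, %seen_key, %value, $chunk);

    for my $line (@lines) {
        $line =~ s/[\r\n]+\z//;    # strip LF or CRLF line endings
        if ( $line =~ /^start data chunk (.*)/ ) {
            $chunk = $1;
            push @chunks, $chunk unless $seen_chunk{$chunk}++;
        }
        elsif ( defined $chunk and $line =~ /^\s+(\S+)\s+(.*)/ ) {
            # Assumption: the key is the first word, the value is the rest.
            push @keys, $1 unless $seen_key{$1}++;
            $value{$chunk}{$1} = $2;
        }
    }

    # Header row first, then one row per key with a blank cell
    # wherever a chunk has no value for that key.
    my @rows = ( join ",", "start data chunk", map { csv_field($_) } @chunks );
    for my $key (@keys) {
        push @rows, join ",", csv_field($key),
            map { csv_field( $value{$_}{$key} // '' ) } @chunks;
    }
    return @rows;
}

# The sample list from the top of this post.
my @sample = (
    "start data chunk abc\n", "  key1 value1\n", "  key2 value2\n",
    "start data chunk def\n", "  key2 value2\n",
    "start data chunk ghi\n", "  key1 value1\n",
    "start data chunk jkl\n", "  key1 value1\n", "  key2 value2\n",
);
print "$_\n" for chunks_to_csv(@sample);
```

Run against the sample list, this prints the same three lines as the desired CSV output above. To use it on a file, read the lines the same way the original script does and pass them to chunks_to_csv.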