sciris.sc_dataframe

Simple alternative to the Pandas DataFrame.

This class is rarely used and not well maintained; in most cases, it is probably better to just use the Pandas one.

Classes

dataframe

A simple data frame, based on simple lists, for simply storing simple data.

class dataframe(cols=None, data=None, nrows=None)[source]

A simple data frame, based on simple lists, for simply storing simple data. Much less feature-rich than a Pandas data frame, but simpler to use. Note: this class is semi-deprecated; use at your own risk. To be honest, Pandas is a much better solution most of the time.

Example:

a = sc.dataframe(cols=['x','y'],data=[[1238,2],[384,5],[666,7]]) # Create data frame
a['x'] # Print out a column
a[0] # Print out a row
a['x',0] # Print out an element
a[0] = [123,6]; print(a) # Set values for a whole row
a['y'] = [8,5,0]; print(a) # Set values for a whole column
a['z'] = [14,14,14]; print(a) # Add new column
a.addcol('z', [14,14,14]); print(a) # Alternate way to add new column
a.rmcol('z'); print(a) # Remove a column
a.pop(1); print(a) # Remove a row
a.append([555,2,14]); print(a) # Append a new row
a.insert(1,[555,2,14]); print(a) # Insert a new row
a.sort(); print(a) # Sort by the first column
a.sort('y'); print(a) # Sort by the second column
a.addrow([555,2,14]); print(a) # Replace the previous row and sort
a.getrow(1) # Return the row starting with value '1'
a.rmrow(); print(a) # Remove last row
a.rmrow(1238); print(a) # Remove the row starting with element 1238

The dataframe can be used for both numeric and non-numeric data.

Version: 2020nov29
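To make the "based on simple lists" idea concrete, the sketch below shows one way a list-backed frame could support the three indexing forms used in the example above (column, row, element). MiniFrame is a hypothetical stand-in written for illustration only; it is not the sciris implementation, and its internal layout is an assumption.

```python
class MiniFrame:
    """Hypothetical list-backed frame for illustration; not the sciris class."""

    def __init__(self, cols, data):
        self.cols = list(cols)                    # Column names
        self.rows = [list(row) for row in data]   # One inner list per row

    def __getitem__(self, key):
        if isinstance(key, tuple):                # frame['x', 0] -> single element
            col, row = key
            return self.rows[row][self.cols.index(col)]
        if isinstance(key, str):                  # frame['x'] -> whole column
            idx = self.cols.index(key)
            return [row[idx] for row in self.rows]
        return list(self.rows[key])               # frame[0] -> copy of one row

    @property
    def shape(self):
        # (number of rows, number of columns), with the headers excluded
        return (len(self.rows), len(self.cols))


frame = MiniFrame(cols=['x', 'y'], data=[[1238, 2], [384, 5], [666, 7]])
column = frame['x']       # A column, as a list
row = frame[0]            # A row, as a list
element = frame['x', 0]   # A single element
```

Dispatching on the key type inside one __getitem__ is what lets a single bracket syntax cover all three access patterns.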

make(cols=None, data=None, nrows=None)[source]

Creates a dataframe from the supplied input data.

Usage examples:

df = sc.dataframe()
df = sc.dataframe(['a','b','c'])
df = sc.dataframe(['a','b','c'], nrows=2)
df = sc.dataframe([['a','b','c'],[1,2,3],[4,5,6]])
df = sc.dataframe(['a','b','c'], [[1,2,3],[4,5,6]])
df = sc.dataframe(cols=['a','b','c'], data=[[1,2,3],[4,5,6]])

get(cols=None, rows=None, asarray=True, cast=True)[source]

More complicated way of getting data from a dataframe.

Example:

df = sc.dataframe(cols=['x','y','z'],data=[[1238,2,-1],[384,5,-2],[666,7,-3]]) # Create data frame
df.get(cols=['x','z'], rows=[0,2])

pop(key, returnval=True)[source]

Remove a row from the data frame

append(value)[source]

Add a row to the end of the data frame

property ncols

Get the number of columns in the data frame

property nrows

Get the number of rows in the data frame

property shape

Equivalent to the shape of the data array, minus the headers

addcol(key=None, value=None)[source]

Add a new column to the data frame – for consistency only

rmcol(key, die=True)[source]

Remove a column or columns from the data frame

addrow(value=None, overwrite=True, col=None, reverse=False)[source]

Like append, but removes duplicates in the first column and re-sorts
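The "append, deduplicate, re-sort" behaviour can be sketched on plain lists as follows. The helper add_row is hypothetical and only models the default case (an existing row with the same key is overwritten); how the real method treats overwrite=False is not covered here.

```python
def add_row(rows, new_row, col=0, reverse=False):
    """Hypothetical helper: add new_row, replacing any row with the same
    value in column index `col`, then return the rows sorted by that column."""
    key = new_row[col]
    kept = [row for row in rows if row[col] != key]   # Drop rows with a duplicate key
    kept.append(list(new_row))                        # Add the new row
    return sorted(kept, key=lambda row: row[col], reverse=reverse)


rows = [[1238, 2], [384, 5], [666, 7]]
updated = add_row(rows, [384, 9])   # Replaces the existing 384 row, then sorts
```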

rmrow(key=None, col=None, returnval=False, die=True)[source]

Like pop, but removes by matching the first column instead of the index – WARNING, messy

rmrows(indices=None, copy=None)[source]

Remove rows by index – WARNING, messy

replace(col=None, old=None, new=None)[source]

Replace all of one value in a column with a new value
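As a rough illustration of value replacement within one column, the sketch below works on a plain list of rows. The function replace_in_column is a hypothetical name, and returning the number of replacements is an assumption made for the example, not documented behaviour.

```python
def replace_in_column(cols, rows, col, old, new):
    """Hypothetical helper: replace every `old` in column `col` with `new`, in place."""
    idx = cols.index(col)      # Position of the column within each row
    count = 0
    for row in rows:
        if row[idx] == old:
            row[idx] = new
            count += 1
    return count               # How many values were replaced


cols = ['x', 'y']
rows = [[1238, 2], [384, 5], [666, 2]]
n_replaced = replace_in_column(cols, rows, col='y', old=2, new=0)
```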

findrow(key=None, col=None, default=None, closest=False, die=False, asdict=False)[source]

Return a row by searching for a matching value.

Parameters
  • key – the value to look for

  • col – the column to look for this value in

  • default – the value to return if key is not found (overrides die)

  • closest – whether or not to return the closest row (overrides default and die)

  • die – whether to raise an exception if the value is not found

  • asdict – whether to return results as dict rather than list
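The precedence among closest, default and die described in the parameter list can be sketched on plain lists as shown here. The function find_row is hypothetical; it assumes that col defaults to the first column, that "closest" means the smallest absolute numeric difference, and it returns lists rather than arrays.

```python
def find_row(cols, rows, key, col=None, default=None, closest=False,
             die=False, asdict=False):
    """Hypothetical sketch of the lookup rules; not the sciris implementation."""
    idx = 0 if col is None else cols.index(col)       # Assumed default: first column
    match = next((row for row in rows if row[idx] == key), None)
    if match is None and closest and rows:
        # closest overrides default and die (numeric keys only in this sketch)
        match = min(rows, key=lambda row: abs(row[idx] - key))
    if match is None:
        if default is not None:                       # default overrides die
            return default
        if die:
            raise KeyError(f'No row found matching {key!r}')
        return None
    return dict(zip(cols, match)) if asdict else list(match)


cols = ['year', 'val']
rows = [[2016, 0.3], [2017, 0.5]]
exact = find_row(cols, rows, 2016)
missing = find_row(cols, rows, 2013)
nearest = find_row(cols, rows, 2013, closest=True)
as_dict = find_row(cols, rows, 2016, asdict=True)
```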

Example:

df = sc.dataframe(cols=['year','val'],data=[[2016,0.3],[2017,0.5]])
df.findrow(2016) # returns array([2016, 0.3], dtype=object)
df.findrow(2013) # returns None, or exception if die is True
df.findrow(2013, closest=True) # returns array([2016, 0.3], dtype=object)
df.findrow(2016, asdict=True) # returns {'year':2016, 'val':0.3}

findrows(key=None, col=None, asarray=False)[source]

A method like get() or indexing, but returns a dataframe by default – WARNING, redundant?

rowindex(key=None, col=None)[source]

Return the indices of all rows matching the given key in a given column.

filter_in(key=None, col=None, verbose=False, copy=None)[source]

Keep only rows matching a criterion (in place)

filter_out(key=None, col=None, verbose=False, copy=None)[source]

Remove rows matching a criterion (in place)
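The two filters are complements of each other: both start from the indices of the matching rows, and one keeps them while the other drops them. A minimal sketch on plain lists follows; the helper names are hypothetical, matching by equality against the first column by default is an assumption, and the copy and verbose arguments are not modelled.

```python
def matching_indices(cols, rows, key, col=None):
    """Indices of all rows whose value in `col` equals `key`."""
    idx = 0 if col is None else cols.index(col)   # Assumed default: first column
    return [i for i, row in enumerate(rows) if row[idx] == key]


def filter_in(cols, rows, key, col=None):
    """Keep only the matching rows (modifies `rows` in place)."""
    keep = set(matching_indices(cols, rows, key, col))
    rows[:] = [row for i, row in enumerate(rows) if i in keep]


def filter_out(cols, rows, key, col=None):
    """Remove the matching rows (modifies `rows` in place)."""
    drop = set(matching_indices(cols, rows, key, col))
    rows[:] = [row for i, row in enumerate(rows) if i not in drop]


cols = ['year', 'val']
data = [[2016, 0.3], [2017, 0.5], [2016, 0.7]]
kept = [list(row) for row in data]
filter_in(cols, kept, 2016)       # Only the 2016 rows remain
dropped = [list(row) for row in data]
filter_out(cols, dropped, 2016)   # Only the 2017 row remains
```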

filtercols(cols=None, die=True, copy=None)[source]

Filter columns keeping only those specified

insert(row=0, value=None)[source]

Insert a row at the specified location

sort(col=None, reverse=False)[source]

Sort the data frame by the specified column(s)
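Row sorting by one or more columns can be sketched with Python's built-in sorted. The helper sort_rows is hypothetical; defaulting to the first column follows the class example above, while giving earlier columns priority when several are passed is an assumption.

```python
def sort_rows(cols, rows, col=None, reverse=False):
    """Hypothetical helper: return rows sorted by one column name or a list of names."""
    if col is None:
        names = [cols[0]]             # Default: sort by the first column
    elif isinstance(col, str):
        names = [col]
    else:
        names = list(col)
    idxs = [cols.index(name) for name in names]
    # Tuples compare element by element, so earlier columns take priority
    return sorted(rows, key=lambda row: tuple(row[i] for i in idxs), reverse=reverse)


cols = ['x', 'y']
rows = [[1238, 2], [384, 5], [666, 7], [384, 1]]
by_first = sort_rows(cols, rows)
by_y_desc = sort_rows(cols, rows, 'y', reverse=True)
by_both = sort_rows(cols, rows, ['x', 'y'])
```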

sortcols(sortorder=None, reverse=False)[source]

Like sort, but reorders the columns instead of the rows

jsonify(cols=None, rows=None, header=None, die=True)[source]

Export the dataframe to a JSON-compatible format

pandas(df=None)[source]

Function to export to pandas (if no argument) or import from pandas (with an argument)

export(filename=None, cols=None, close=True)[source]

Export to Excel