Resources for Pandas

Python library to deal with data frames


We implement our own command in Python and distribute it through pip, so it can be installed with pip install sheet2graph.

The command takes spreadsheet files (csv, xlsx) as input and generates images (png, jpg, svg) from the data they contain.


We create a virtual environment and set up the project.

• Organizing commands with a Makefile
• Creating our Virtualenv
• Installing packages
• Using requirements.txt
• Git branching

We save our first graph from the spreadsheet data.

• Installing Plotly to generate graphs
• Pip freeze requirements
• Solving dependency problems
• Using Pandas to do simple data processing

We will add support for Excel files (.xlsx).

• Adding dependencies: Openpyxl and Xlrd for Excel support
• Fixing type errors
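As a minimal sketch of this step, the loader below picks a Pandas reader from the file extension. The name load_table and the use of header=None (so that the header line stays available as row 1) are assumptions made for illustration, not the project's actual code. Openpyxl is the engine Pandas uses for .xlsx files; recent Xlrd releases only read the older .xls format.

```python
from pathlib import Path

import pandas as pd


def load_table(path: str) -> pd.DataFrame:
    """Read a .csv or .xlsx file into a DataFrame (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        # header=None keeps the first line as data instead of column names.
        return pd.read_csv(path, header=None)
    if suffix == ".xlsx":
        # Reading .xlsx requires the openpyxl package to be installed.
        return pd.read_excel(path, header=None, engine="openpyxl")
    raise ValueError(f"Unsupported file type: {path!r}")
```

Dispatching on the extension keeps the rest of the command unaware of the input format: everything downstream only sees a DataFrame.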

In addition to a local file (.csv, .xlsx), we accept a Google Drive public document as input.

• Parsing the URL of the Google Drive document
• Adding the input option transparently for the user
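One way to sketch the URL handling, assuming the public document is a Google Sheets link of the form https://docs.google.com/spreadsheets/d/<id>/... and that its CSV export is reachable at .../export?format=csv. Both URL shapes and the function names are assumptions for illustration; the project may parse links differently.

```python
import re
from typing import Optional

# Matches the "/d/<id>" segment that Google document links commonly contain.
_DOC_ID = re.compile(r"/d/([A-Za-z0-9_-]+)")


def google_sheet_csv_url(source: str) -> Optional[str]:
    """Return a CSV export URL if `source` looks like a Google Sheets link."""
    if "docs.google.com/spreadsheets" not in source:
        return None
    match = _DOC_ID.search(source)
    if match is None:
        return None
    doc_id = match.group(1)
    return f"https://docs.google.com/spreadsheets/d/{doc_id}/export?format=csv"


def resolve_input(source: str) -> str:
    """Keep local paths unchanged; rewrite recognised Google links."""
    return google_sheet_csv_url(source) or source
```

Because resolve_input accepts the same single argument for both cases, the user passes either a file path or a link and does not need a separate option.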

After using hardcoded column names, we will start making the spreadsheet processing generic. This way our command will work with any spreadsheet file. The first step is to print the data of our input file, so the user can preview it.

• Print-only option to print the input file provided by the user
• Transforming the data with Pandas to index it by letter and 1-based integer (as in spreadsheet applications like Excel)
• Adding tests for the new indexing by letters and integer for columns and rows
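The relabelling can be sketched as follows. The helper names column_letter and as_spreadsheet are hypothetical, and the sketch assumes the file was read without a header row, so that the first line of the file becomes row 1 as it would in a spreadsheet application.

```python
import pandas as pd


def column_letter(index: int) -> str:
    """Convert a 0-based column position to letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1  # work in 1-based numbers, as spreadsheets do
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def as_spreadsheet(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with columns A, B, C, ... and rows 1, 2, 3, ..."""
    out = df.copy()
    out.columns = [column_letter(i) for i in range(out.shape[1])]
    out.index = range(1, len(out) + 1)
    return out
```

After this step a cell can be addressed the way a user reads it on screen, for example as_spreadsheet(df).loc[4, "B"] for cell B4, and printing the relabelled frame gives the preview.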

For each axis, we will allow the user to use expressions like 'b4,b5,b6,b7' or 'B4:B7' to select cells or ranges to graph.

• Adding options '-x' and '-y' to select the data to be graphed
• Making the expressions case-insensitive
• Implementing a comma-separated selection option
• Implementing a range selection option
• Adding tests first and making them pass after implementation, as in test-driven development (TDD)
• Better and more informative user messages in case of error
• Verifying and fixing problems in our Pandas implementation
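A minimal sketch of the selection parsing, written without Pandas so it can be tested in isolation. The function names are hypothetical, and expanding a range column by column is an assumption about the intended order rather than the project's documented behaviour.

```python
import re
from typing import List, Tuple

# A cell reference is one or more letters followed by a 1-based row number.
_CELL = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def _letters_to_number(letters: str) -> int:
    """Convert column letters to a 1-based number (A -> 1, AA -> 27)."""
    number = 0
    for char in letters:
        number = number * 26 + (ord(char) - ord("A") + 1)
    return number


def _number_to_letters(number: int) -> str:
    """Convert a 1-based column number back to letters (27 -> AA)."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def parse_cell(text: str) -> Tuple[str, int]:
    """Parse 'b4' or 'B4' into ('B', 4); raise ValueError otherwise."""
    match = _CELL.match(text.strip().upper())
    if match is None:
        raise ValueError(f"Invalid cell reference {text!r}, expected something like 'B4'")
    return match.group(1), int(match.group(2))


def parse_selection(expression: str) -> List[Tuple[str, int]]:
    """Expand 'b4,b5' or 'B4:B7' into a list of (column, row) pairs."""
    cells = []
    for part in expression.split(","):
        if ":" in part:
            start_text, end_text = part.split(":", 1)
            (col1, row1), (col2, row2) = parse_cell(start_text), parse_cell(end_text)
            col_lo, col_hi = sorted((_letters_to_number(col1), _letters_to_number(col2)))
            row_lo, row_hi = sorted((row1, row2))
            for col in range(col_lo, col_hi + 1):
                for row in range(row_lo, row_hi + 1):
                    cells.append((_number_to_letters(col), row))
        else:
            cells.append(parse_cell(part))
    return cells
```

With a DataFrame indexed by letters and 1-based row numbers, the values for an axis could then be collected with [df.loc[row, col] for col, row in parse_selection(expression)]. Raising ValueError with the offending text gives the command a natural place to produce an informative message for the user.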

We will check that everything works so far and add extra options to set custom, user-defined axis labels.

• Debugging broken tests and making all tests pass after all our changes
• Adding x-label and y-label options to our command to set the labels of the horizontal and vertical axes
• Debugging column types in Pandas
• Using exceptions to deal with unreliable cases
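The options and the error handling could be wired together roughly as below. Only '-x' and '-y' come from the outline above; the long option names --x-label and --y-label, the fallback to the selection text, and the to_numbers helper are assumptions for illustration.

```python
import argparse
from typing import Iterable, List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface sketch; label option names are hypothetical."""
    parser = argparse.ArgumentParser(prog="sheet2graph")
    parser.add_argument("input", help="spreadsheet file or link")
    parser.add_argument("-x", help="cells for the horizontal axis, e.g. A2:A7")
    parser.add_argument("-y", help="cells for the vertical axis, e.g. B2:B7")
    parser.add_argument("--x-label", default=None, help="horizontal axis label")
    parser.add_argument("--y-label", default=None, help="vertical axis label")
    return parser


def axis_title(label: Optional[str], selection: Optional[str]) -> Optional[str]:
    """Use the custom label when given, otherwise fall back to the selection."""
    return label if label is not None else selection


def to_numbers(values: Iterable[object]) -> List[float]:
    """Convert selected cells to floats, failing with a readable message."""
    numbers = []
    for value in values:
        try:
            numbers.append(float(value))
        except (TypeError, ValueError):
            raise ValueError(f"Cannot plot non-numeric value {value!r}") from None
    return numbers
```

Cells read from a spreadsheet can arrive as strings or mixed types, so converting inside a try block turns an obscure type error into a message that names the offending value.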

