For this tutorial I am going to assume that you have some idea about using either Jupyter Notebook or Python in general. I also assume that you have Anaconda installed, or know how to install packages into Python. If you do not, I would suggest first setting a few minutes aside to install Anaconda and take a crash course in Jupyter.
I am breaking down the data I'm going to work with because what I talk about in this post can be applied to any other data that looks similar: simple two-column data which, when plotted, forms a 2D line plot with an x-axis and a y-axis. In my lab we use a spectrometer to collect data. For the uninitiated, a spectrometer is basically a fancy prism with a camera at the rainbow end to take a black-and-white picture (intensity) of the rainbow. The data in this case is formed by spatially dispersing the input light into its constituent colors (wavelengths), and the intensity of each color is recorded by the camera. That gives two columns of data: wavelength in nanometers is the first column, and intensity (photon counts, let's say) is the second. The data file I use here holds a near-infrared spectrum around 900 nm; opened in a text editor, it shows these two columns of numbers. If you would like to use the same data file I am using, you can download it from here. Note that ASCII files like these are easier to handle for us beginners and should show sensible numbers when opened in Notepad or Microsoft Excel. Other, proprietary formats, such as the ones that come directly out of our spectrometer, like the .SPE format (for Princeton Instruments cameras), are binary. These will give you garbage if you try to open them with Notepad or Excel. There is a way to open them using Python, but you need detailed information about the format and have to construct the code accordingly to read them properly.
In this post I'm going to deal with simpler ASCII files (text, CSV, etc.) only. If you are dealing with SPE or other such difficult file formats, I would suggest using the file conversion software that usually comes with the equipment to export the binary file to txt or csv format.
Once you have the exported txt or csv, there are a couple of ways to proceed. Based on my tests of the time taken to read really large data files, and for versatility (as you will see in the bonus tips), I have now settled on using pandas to read my files. This is a module that comes installed with Anaconda. To import it, type the following:
    import pandas as pd
This imports the pandas module, and all of the useful functions inside it can then be used with the "pd." prefix. Similarly, the other scientific computing favorite, numpy, is usually imported as "np", and you do it exactly as you did with pandas:
    import numpy as np
Pandas is a great library for handling big datasets, but we will use only a small part of it to read our data file.
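As a rough sketch of how the pandas and numpy imports described above might be used to read a two-column file, here is a small self-contained example. It makes several assumptions that are not from the post: the columns are tab-separated, there is no header row, and the column names `wavelength_nm` and `intensity_counts` are made up for illustration. The numbers are synthetic stand-ins, not the author's spectrum.

```python
import io

import numpy as np
import pandas as pd

# Synthetic stand-in for a spectrum file (made-up values, not the author's data).
# Assumption: two tab-separated columns with no header row.
sample_text = "899.0\t120\n899.5\t135\n900.0\t410\n900.5\t150\n901.0\t118\n"

# read_csv accepts a file path or a file-like object. To read a real file,
# replace io.StringIO(sample_text) with the path to your own txt or csv file.
df = pd.read_csv(
    io.StringIO(sample_text),
    sep="\t",                                      # assumed delimiter
    header=None,                                   # assumed: no header row
    names=["wavelength_nm", "intensity_counts"],   # hypothetical column names
)

# Pull each column out as a numpy array for later plotting or analysis.
wavelength = df["wavelength_nm"].to_numpy()
intensity = df["intensity_counts"].to_numpy()

# Example use of numpy: find the wavelength at which the intensity is highest.
peak_wavelength = wavelength[np.argmax(intensity)]

print(df.shape)
print(peak_wavelength)
```

If your file uses commas or spaces instead of tabs, the `sep` argument is the part to change (for example `sep=","`), and if it has a header row you can drop `header=None` and `names=...` so pandas picks the names up from the file.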