Comparison of noaa-stations with module-noaa-stations
This document compares two implementations of similar functionality: a stand-alone command-line tool written in Python, and a Zimagi module.
Code required
This snapshot compares NOAA-Stations at commits b910e2e and 76a0dae with module-noaa-stations at commit 58d87e6. Line counts and supported features may change in future revisions.
Zimagi module [b] [c]:
module-noaa-stations % cloc .  # 58d87e6
      17 text files.
      17 unique files.
       9 files ignored.

github.com/AlDanial/cloc v 1.86  T=0.02 s (760.2 files/s, 36315.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
YAML                             8             69             66            272
Python                           4             26             11            110 (33)
Markdown                         1             23              0             44
-------------------------------------------------------------------------------
SUM:                            13            118             77            426
-------------------------------------------------------------------------------
Command-line tool [a] [b]:
NOAA-Stations % cloc .  # b910e2e1
       7 text files.
       7 unique files.
       3 files ignored.

github.com/AlDanial/cloc v 1.86  T=0.01 s (789.9 files/s, 30489.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           2             12             34             85
Markdown                         2             21              0             31
YAML                             1              0              0             10
-------------------------------------------------------------------------------
SUM:                             5             33             34            126
-------------------------------------------------------------------------------
Command-line tool (normalized tables) [a] [b]:
NOAA-Stations % cloc .  # 76a0daee
       7 text files.
       7 unique files.
       3 files ignored.

github.com/AlDanial/cloc v 1.86  T=0.01 s (680.8 files/s, 43300.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Markdown                         2             46              0            107
Python                           2             16             44             95
YAML                             1              0              0             10
-------------------------------------------------------------------------------
SUM:                             5             62             44            212
-------------------------------------------------------------------------------
- [a] The small YAML file in NOAA-Stations is a conda environment configuration file. It is only indirectly related to the tool itself, but it is reasonable to count the specification of dependencies as part of the tool's requirements. A pip `requirements.txt` would be of similar length.
- [b] The Markdown files in both repositories are entirely documentation and are not directly related to the functionality of either. For the most part, that documentation is this file itself.
- [c] Three of the four Python files in module-noaa-stations are auto-generated. The only file written by hand contains 33 code lines (and some comments).
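As a rough check on these figures, the functionally required lines can be tallied by dropping Markdown and the auto-generated Python from the cloc totals. The sketch below is illustrative only: the dictionary layout and the `functional_lines` helper are hypothetical names, and the counts are copied from the cloc output above.

```python
# cloc "code" column per language, copied from the three runs above.
CLOC_CODE = {
    "module-noaa-stations (58d87e6)": {"YAML": 272, "Python": 110, "Markdown": 44},
    "NOAA-Stations (b910e2e)": {"Python": 85, "Markdown": 31, "YAML": 10},
    "NOAA-Stations (76a0dae)": {"Markdown": 107, "Python": 95, "YAML": 10},
}

# Hand-written Python lines for repositories where the rest of the Python
# is auto-generated (only module-noaa-stations, per the note above).
HANDWRITTEN_PYTHON = {"module-noaa-stations (58d87e6)": 33}


def functional_lines(name):
    """Return YAML plus hand-written Python code lines, ignoring Markdown."""
    counts = CLOC_CODE[name]
    python = HANDWRITTEN_PYTHON.get(name, counts["Python"])
    return counts["YAML"] + python


for name in CLOC_CODE:
    print(f"{name}: {functional_lines(name)}")
```

Under these assumptions the tally comes to 305 lines for the Zimagi module, 95 for the initial command-line tool, and 105 for the normalized command-line tool.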
Feature comparisons
Feature description | Zimagi module | Command-line | CL Normalized |
---|---|---|---|
Exposes all source data columns | No [f] | Yes | Yes |
Download by year range | Partial [d] | Yes | Yes |
Download by station list | Partial [d] | Yes | Yes |
Download of all stations | No [e] | Yes | Yes |
Flexible querying of local DB | Yes | Yes | Yes |
RESTful API to access local DB | Yes | No | Yes |
Missing data cleaned | Partial | Yes | Yes |
Performs good normalization | Yes | No [g] | Yes |
Provisions for cloud deployment | Yes | No [h] | No [h] |
Supports “pretty” output | Yes | Yes | Yes |
Supports CSV export | Yes | Yes | Yes |
Supports TSV export | No | Yes | Yes |
Supports JSON export | Yes | Yes | Yes |
“Code” lines [i] | 305 | 95 | 105 |
- [d] Only a `test` import subcommand is currently defined, but the data model supports parameters for minimum year, maximum year, and station list.
- [e] The logic for obtaining the station list within a year is currently stubbed out, but it should follow the same logic used in the command-line tool.
- [f] Data definitions could be created for the columns not currently utilized. I estimate this would require about 150 additional lines of YAML and perhaps 20 lines of Python.
- [g] The initial command-line tool simply used the same table structure as the source CSV files. Adding a child table with a foreign key would require about 6 extra lines of Python and 8 extra lines of SQL (which is currently defined as a Python string rather than in a separate file).
- [h] None of the current code is cloud-aware, but the code that would need to be deployed to a cloud is minimal.
- [i] YAML or Python code that is functionally required for the system to operate. Documentation in Markdown or other formats is very desirable to have, but does not change functionality. Auto-generated Python code is excluded.
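
The normalization described in footnote g (a parent table plus a child table linked by a foreign key, with the SQL held in a Python string) might look roughly like the sketch below. This is not code from either repository: the use of `sqlite3`, the table names `stations` and `observations`, and every column name are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical schema kept in a Python string; names are illustrative only
# and are not taken from NOAA-Stations.
SCHEMA = """
CREATE TABLE stations (
    station_id TEXT PRIMARY KEY,
    name       TEXT,
    latitude   REAL,
    longitude  REAL
);
CREATE TABLE observations (
    id         INTEGER PRIMARY KEY,
    station_id TEXT NOT NULL REFERENCES stations(station_id),
    date       TEXT NOT NULL,
    temp       REAL
);
"""

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

# One parent row, then child rows that refer to it by station_id.
conn.execute(
    "INSERT INTO stations VALUES (?, ?, ?, ?)",
    ("S001", "Example Station", 40.0, -105.0),
)
conn.executemany(
    "INSERT INTO observations (station_id, date, temp) VALUES (?, ?, ?)",
    [("S001", "2020-01-01", 31.5), ("S001", "2020-01-02", 28.0)],
)
conn.commit()

# Station details are stored once and joined in at query time.
rows = conn.execute(
    "SELECT s.name, o.date, o.temp "
    "FROM observations o JOIN stations s USING (station_id) "
    "ORDER BY o.date"
).fetchall()
print(rows)
```

The point of the split is that per-station attributes are stored once rather than repeated on every observation row, at the cost of a join when querying.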