Rev. | Time | Author |
---|---|---|
e77d3859a5ff tip | 2024-04-24 18:39:37 | Lorenzo Isella |
A simple script to get SMEs and broadband statistics from the scoreboard. |
||
651d887893b6 | 2024-04-17 18:12:24 | Lorenzo Isella |
I introduced a cut-off on the years for the railways aid. |
||
8fe9c3115999 | 2024-04-04 00:20:28 | Lorenzo Isella |
I added an option about including an extra column on TCF/TCTF. |
||
215515959083 | 2024-04-03 20:22:01 | Lorenzo Isella |
I now add info about the TCTF cases, and I handle separately the case where I create a single file and the case where I keep the TAM separate from Poland. |
||
a162c551051f | 2024-04-03 04:44:42 | Lorenzo Isella |
I added a command to compress the csv files. |
||
46c48c865f97 | 2024-04-02 20:23:43 | Lorenzo Isella |
I limit the number of open csv files while writing the data, which improved the performance. |
||
fd36b8a17b68 | 2024-04-02 19:42:32 | Lorenzo Isella |
I now save the csv files more efficiently. |
||
5f195e426233 | 2024-04-01 05:31:25 | Lorenzo Isella |
I cleaned up the code and I again went for a multifile solution. |
||
9cdc07c9adaa | 2024-04-01 05:25:07 | Lorenzo Isella |
I went back to a multi-file parquet solution. It uses much less memory. |
||
a77e5856eb89 | 2024-04-01 05:08:49 | Lorenzo Isella |
I now ensure that beneficiary_region is text. |
||
c726473290a6 | 2024-04-01 04:44:05 | Lorenzo Isella |
I now read a compressed csv file and I save the data as multiple files. It is much easier on memory. |
||
eb51e5cb311a | 2024-03-30 04:20:53 | Lorenzo Isella |
New file to handle the latest version of the Polish TAM. |
||
e7bfcf3fb593 | 2024-03-29 05:26:07 | Lorenzo Isella |
I improved the fruit repository. |
||
c30b264bd21c | 2024-03-28 07:25:02 | Lorenzo Isella |
I added new entries from my sources.list.d directory. |
||
32fb8fd497e6 | 2024-03-28 07:23:14 | Lorenzo Isella |
I added a file to configure mpv. |
||
e33ee7191ce5 | 2024-03-26 23:37:29 | Lorenzo Isella |
I now save the parquet files with a more expressive name. |
||
1b78ef2e4c2d | 2024-03-26 23:30:39 | Lorenzo Isella |
The solution with write_dataset is better when working with very large files. I write multiple parquet files and I have no RAM issues at all. |
||
e8eca9502916 | 2024-03-26 18:52:45 | Lorenzo Isella |
I added new code showing how to handle a big csv file, giving it an automatic schema and filtering the results. |
||
305074685d98 | 2024-03-26 03:58:09 | Lorenzo Isella |
I now ensure that I remove final_spain.csv before I generate it. |
||
c8d552ffbe18 | 2024-03-22 22:45:23 | Lorenzo Isella |
I added another file to carry out some diagnostics on TAM. |
||
f57b0540403c | 2024-03-22 00:51:10 | Lorenzo Isella |
I made a mistake with the saving of a compressed csv file. |
||
0eeeca691bae | 2024-03-21 20:50:49 | Lorenzo Isella |
I modified the obfuscated data set. |
||
4e6f14cc0e4e | 2024-03-21 01:40:57 | Lorenzo Isella |
I save the data directly as a compressed csv file. |
||
81e6b8181ec6 | 2024-03-21 01:31:43 | Lorenzo Isella |
I changed the way in which I handle the date. |
||
036b6d674044 | 2024-03-20 22:26:43 | Lorenzo Isella |
I changed the file names and I now use write_csv_arrow to save the data as a csv. |
||
0a117d28cd27 | 2024-03-20 22:25:50 | Lorenzo Isella |
I can now choose to skip the first part of the data processing once it has been done once and for all. |
||
8fadfc840e06 | 2024-03-18 22:05:13 | Lorenzo Isella |
I now use write_parquet instead of write_dataset to save the results. I am not interested in a multifile solution. |
||
faae683130e0 | 2024-03-18 22:03:37 | Lorenzo Isella |
I changed the name of the input file. |
||
2cb03c06206d | 2024-03-18 18:46:34 | Lorenzo Isella |
I remove two columns I do not need any longer. |
||
1e5dfb3a027a | 2024-03-18 18:37:26 | Lorenzo Isella |
I modified the code because of modifications in the Slovenian input files. |
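
Several entries above (46c48c865f97, c726473290a6) mention reading a compressed csv file, saving it as multiple files, and limiting how many csv files are open while writing. The repository code itself is not shown here, so the following is only a minimal sketch of that general idea, written in Python with the standard library. The function name `split_csv_by_key`, its parameters, and the output file naming are hypothetical and are not taken from the repository; the sketch also assumes that the values of the key column are safe to use as file names.

```python
import csv
import gzip
from collections import OrderedDict
from pathlib import Path


def split_csv_by_key(src, out_dir, key, max_open=4):
    """Split a gzip-compressed CSV into one gzip CSV per value of `key`.

    Hypothetical helper: at most `max_open` output files are open at any
    time; the least recently used one is closed when the cap is reached.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    open_files = OrderedDict()  # key value -> (handle, writer), in LRU order
    started = set()             # key values whose file already has a header

    try:
        with gzip.open(src, "rt", newline="", encoding="utf-8") as fin:
            reader = csv.DictReader(fin)
            for row in reader:  # rows are streamed, never loaded all at once
                value = row[key]
                if value in open_files:
                    open_files.move_to_end(value)  # mark as recently used
                else:
                    if len(open_files) >= max_open:
                        # Close the least recently used output file.
                        _, (old_handle, _) = open_files.popitem(last=False)
                        old_handle.close()
                    # Reopen in append mode if this file was written before.
                    mode = "at" if value in started else "wt"
                    handle = gzip.open(out_dir / f"{value}.csv.gz", mode,
                                       newline="", encoding="utf-8")
                    writer = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                    if value not in started:
                        writer.writeheader()
                        started.add(value)
                    open_files[value] = (handle, writer)
                open_files[value][1].writerow(row)
    finally:
        # Close whatever is still open, even if an error occurred.
        for handle, _ in open_files.values():
            handle.close()
    return sorted(started)
```

A call such as `split_csv_by_key("input.csv.gz", "out", "country", max_open=8)` (file and column names are placeholders) would write one compressed file per country. Reopening a file in append mode adds a new gzip member to it, which standard gzip readers decompress as one continuous stream.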