Conversation
jason-raitz
commented
Mar 18, 2026
- removed unused client.search()
- added client.write_search_results_to_file()
- added method to iterate through search xml results
- added some xml fixtures for a first and last result for a sample search as well as the expected output xml for said search.
- removed unused client.search() - added client.write_search_results_to_file() - added method to iterate through search xml results - added some xml fixtures for a first and last result for a sample search as well as the expected output xml for said search.
- commenting out for now to use as template for new tests
|
Since the eventual goal was to migrate Willa to this as well, I don't think |
anarchivist
left a comment
There was a problem hiding this comment.
generally looks good. what else is left other than the tests?
| NS = "http://www.loc.gov/MARC21/slim" | ||
| E.register_namespace("", NS) | ||
|
|
||
| # remove namespace that ElementTree adds to records when passed | ||
| _NS_DECL: str = f' xmlns="{NS}"' | ||
|
|
There was a problem hiding this comment.
why are we stripping the namespace? does it lead to redundancy if we don't?
There was a problem hiding this comment.
If we register the namespace, the library will add that to every record. We could just not register it and manually enter it for the collection.
- and some sensible guard statements - checked for some edge cases - lengthened max-line-length defaults to be a little friendlier
Questions
|
awilfox
left a comment
There was a problem hiding this comment.
Looking really good, but we should find answers to your questions before merging IMO.
|
|
||
| return recs | ||
|
|
||
| def write_search_results_to_file(self, query: str = "", output_file_name: str = "tind.xml") -> int: |
There was a problem hiding this comment.
This is the only non-test line that is above 100 characters; renaming output_file_name to output_file would be enough to bring it under the limit, if we don't want to raise it.
There was a problem hiding this comment.
i'd also wager that we can split it into lines which would be sufficient.
| @@ -1,4 +1,4 @@ | |||
| [flake8] | |||
| max-line-length = 100 | |||
| max-line-length = 120 | |||
There was a problem hiding this comment.
not convinced, will elaborate in overall comment.
There's a lot of debate in the Python community over this. PEP 8 says 79 characters. I don't think modern Python can be effectively written with a 79 character limit. PyCharm and the Google style guide use 120. I think this is too long, and indeed, with my present eyesight my font size is too large to allow 120 characters to fit on the laptop screen. I think 100 is a fair compromise and is what we used in Willa.
This would be a great discussion to have at sprint planning.
I'm firmly on the side of 0 results not being an error or an exceptional condition. |
| def test_write_search_results_to_file_with_malformed_output_filename( | ||
| client: TINDClient, | ||
| malformed_filename: str = " .csv", | ||
| ) -> None: | ||
| """write_search_results_to_file raises ValueError for a malformed output filename.""" | ||
| with pytest.raises(ValueError, match="output_file_name"): | ||
| client.write_search_results_to_file("", output_file_name = malformed_filename) |
There was a problem hiding this comment.
how are we determining a malformed filename? i don't remember this being defined in the ticket.
There was a problem hiding this comment.
it's not. just an edge case I threw in as a possibility. I could see a case where someone accidentally tries to give csv extension for the xml file.
|
|
||
| return recs | ||
|
|
||
| def write_search_results_to_file(self, query: str = "", output_file_name: str = "tind.xml") -> int: |
There was a problem hiding this comment.
i'd also wager that we can split it into lines which would be sufficient.
| with open(output_path, "w", encoding="utf-8") as f: | ||
| f.write(f'<?xml version="1.0" encoding="UTF-8"?>\n<collection xmlns="{NS}">\n') | ||
| for record in self._iter_xml_records(query): | ||
| record_xml = E.tostring(record, encoding="unicode") | ||
| f.write(record_xml.replace(_NS_DECL, "")) | ||
| f.write("\n") | ||
| recs_written += 1 | ||
|
|
||
| f.write("</collection>\n") |
There was a problem hiding this comment.
i'm trying to reconcile what's happening here with what's happening in the test - if i'm understanding this code correctly, we'd be writing the XML declaration with an empty <collection xmlns="http://www.loc.gov/MARC21/slim"></collection> tag pair. is that correct?
There was a problem hiding this comment.
good catch! Should I change it to delete the file if no results are written?
| [tool.pylint.format] | ||
| max-line-length = 120 |
There was a problem hiding this comment.
not convinced (just like @awilfox; just making sure we're catching it here, too)
Co-authored-by: Anna Wilcox <AWilcox@Wilcox-Tech.com>