Skip to content

first pass for AP-616#4

Open
jason-raitz wants to merge 5 commits intomainfrom
AP-616_add_search-to-file_method
Open

first pass for AP-616#4
jason-raitz wants to merge 5 commits intomainfrom
AP-616_add_search-to-file_method

Conversation

@jason-raitz
Copy link
Contributor

  • removed unused client.search()
  • added client.write_search_results_to_file()
  • added method to iterate through search xml results
  • added some xml fixtures for a first and last result for a sample search as well as the expected output xml for said search.

 - removed unused client.search()
 - added client.write_search_results_to_file()
 - added method to iterate through search xml results
 - added some xml fixtures for a first and last result for a sample
   search as well as the expected output xml for said search.
@jason-raitz jason-raitz self-assigned this Mar 18, 2026
 - commenting out for now to use as template for new tests
@awilfox
Copy link
Member

awilfox commented Mar 18, 2026

Since the eventual goal was to migrate Willa to this as well, I don't think search should be removed.

Copy link
Member

@anarchivist anarchivist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally looks good. what else is left other than the tests?

Comment on lines +19 to +24
NS = "http://www.loc.gov/MARC21/slim"
E.register_namespace("", NS)

# remove namespace that ElementTree adds to records when passed
_NS_DECL: str = f' xmlns="{NS}"'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we stripping the namespace? does it lead to redundancy if we don't?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we register the namespace, the library will add that to every record. We could just not register it and manually enter it for the collection.

 - and some sensible guard statements
 - checked for some edge cases
 - lengthened max-line-length defaults to be a little friendlier
@jason-raitz jason-raitz marked this pull request as ready for review March 23, 2026 20:56
@jason-raitz
Copy link
Contributor Author

Questions

  • Do we want a standard max-line-length for python?
  • Do we want to formalize which python tools to use across our various python projects?
    (pylint, flake8, mypy, pydoc, uv, linting standards)
  • For client.write_search_results_to_file(), do we want to return '0' or an error when a Tind response is
    successful, but has no matches? Currently it writes nothing to a file and returns '0'.

Copy link
Member

@awilfox awilfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking really good, but we should find answers to your questions before merging IMO.


return recs

def write_search_results_to_file(self, query: str = "", output_file_name: str = "tind.xml") -> int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only non-test line that is above 100 characters; renaming output_file_name to output_file would be enough to bring it under the limit, if we don't want to raise it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd also wager that we can split it into lines which would be sufficient.

@@ -1,4 +1,4 @@
[flake8]
max-line-length = 100
max-line-length = 120
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not convinced, will elaborate in overall comment.

@awilfox
Copy link
Member

awilfox commented Mar 23, 2026

Questions

  • Do we want a standard max-line-length for python?

There's a lot of debate in the Python community over this. PEP 8 says 79 characters. I don't think modern Python can be effectively written with a 79 character limit. PyCharm and the Google style guide use 120. I think this is too long, and indeed, with my present eyesight my font size is too large to allow 120 characters to fit on the laptop screen. I think 100 is a fair compromise and is what we used in Willa.

  • Do we want to formalize which python tools to use across our various python projects?
    (pylint, flake8, mypy, pydoc, uv, linting standards)

This would be a great discussion to have at sprint planning.

  • For client.write_search_results_to_file(), do we want to return '0' or an error when a Tind response is
    successful, but has no matches? Currently it writes nothing to a file and returns '0'.

I'm firmly on the side of 0 results not being an error or an exceptional condition.

Comment on lines +194 to +200
def test_write_search_results_to_file_with_malformed_output_filename(
client: TINDClient,
malformed_filename: str = " .csv",
) -> None:
"""write_search_results_to_file raises ValueError for a malformed output filename."""
with pytest.raises(ValueError, match="output_file_name"):
client.write_search_results_to_file("", output_file_name = malformed_filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are we determining a malformed filename? i don't remember this being defined in the ticket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not. just an edge case I threw in as a possibility. I could see a case where someone accidentally tries to give csv extension for the xml file.


return recs

def write_search_results_to_file(self, query: str = "", output_file_name: str = "tind.xml") -> int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd also wager that we can split it into lines which would be sufficient.

Comment on lines +200 to +208
with open(output_path, "w", encoding="utf-8") as f:
f.write(f'<?xml version="1.0" encoding="UTF-8"?>\n<collection xmlns="{NS}">\n')
for record in self._iter_xml_records(query):
record_xml = E.tostring(record, encoding="unicode")
f.write(record_xml.replace(_NS_DECL, ""))
f.write("\n")
recs_written += 1

f.write("</collection>\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm trying to reconcile what's happening here with what's happening in the test - if i'm understanding this code correctly, we'd be writing the XML declaration with an empty <collection xmlns="http://www.loc.gov/MARC21/slim"></collection> tag pair. is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! Should I change it to delete the file if no results are written?

Comment on lines +64 to +65
[tool.pylint.format]
max-line-length = 120
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not convinced (just like @awilfox; just making sure we're catching it here, too)

Co-authored-by: Anna Wilcox <AWilcox@Wilcox-Tech.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants