The best list online is available at Trove here. It not only lists all newspaper mastheads ever published in Australia but also shows which ones have been digitised by the Trove team, and for which years. What the list doesn’t show is the total number of mastheads by date range and location.
I wanted to know which newspapers existed in Australia between 1901 and 1939. To do that, I copied each of the state lists and the national list (all years) to a spreadsheet, then worked with ChatGPT to develop a script that would extract those that existed at any point during that range. Of course, I could have used OpenRefine, split columns, created date-sorting columns and so on, but I didn’t need a comprehensive list just yet. I just wanted a total number for a single sentence.
Having found that, I thought I’d share!
Total number of newspaper mastheads 1832-2024: 1851
In the date range I’m looking at, 1901-1939, 1143 mastheads were in existence at some point. Here’s the rough script I used to get that number:
import pandas as pd
import re


# Function to check if a date range overlaps with 1901-1939
def is_in_date_range(masthead):
    if not isinstance(masthead, str):
        return False  # Skip non-string values
    matches = re.findall(r'(\d{4})(?:\s?-\s?(\d{4})?)?', masthead)
    for match in matches:
        start_year = int(match[0])
        end_year = int(match[1]) if match[1] else start_year
        if 1901 <= end_year and 1939 >= start_year:
            return True
    return False


# Function to process the spreadsheet and extract relevant newspapers
def filter_newspapers_by_date_range(input_file, output_file):
    try:
        # Load the Excel file
        data = pd.read_excel(input_file)
        # Ensure the relevant column exists
        if 'Masthead' not in data.columns:
            raise ValueError("The file must contain a 'Masthead' column.")
        # Filter rows where dates in the Masthead column overlap with 1901-1939
        filtered_data = data[data['Masthead'].apply(is_in_date_range)]
        # Save the filtered data to a new Excel file
        filtered_data.to_excel(output_file, index=False)
        print(f"Filtering complete. Filtered data saved to {output_file}")
    except Exception as e:
        print(f"An error occurred: {e}")


# Main script
if __name__ == "__main__":
    input_file = r"your_input_file_location\aust-newspapers-gazettes-1832-2024.xlsx"  # Input file path
    output_file = r"your_output_file_location\newspapers.xlsx"  # Output file path
    filter_newspapers_by_date_range(input_file, output_file)
If you can write in Python you might wish to streamline this code.
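One possible streamlining, offered as a rough sketch rather than a tested replacement: pull the overlap test out into a function that takes the period as arguments, so the same check works for any date range. The function name `overlaps` and the `open_end_year` parameter are my own inventions, and the sample strings in the comments are made up. It assumes years are four digits separated by a hyphen or en dash. Note that the script above treats an open-ended entry such as "1895 -" as a single year; this sketch does the same by default, and only treats it as still running if you pass `open_end_year`.

```python
import re

# Matches "1901", "1901-1939", "1901 - 1939" and open-ended "1901 -".
# Assumption: years are four digits, separated by a hyphen or en dash.
YEAR_RANGE = re.compile(r"(\d{4})(?:\s*[-–]\s*(\d{4})?)?")


def overlaps(text, first_year, last_year, open_end_year=None):
    """Return True if any year range in text overlaps first_year..last_year.

    open_end_year (hypothetical parameter): if given, a range with a
    dash but no end year, e.g. "1895 -", is treated as running until
    that year. If omitted, it is treated as a single year.
    """
    if not isinstance(text, str):
        return False  # Skip empty cells and other non-text values
    for m in YEAR_RANGE.finditer(text):
        start = int(m.group(1))
        if m.group(2):
            end = int(m.group(2))
        elif open_end_year is not None and m.group(0).rstrip().endswith(("-", "–")):
            end = open_end_year
        else:
            end = start
        # Two ranges overlap when each one starts before the other ends
        if start <= last_year and end >= first_year:
            return True
    return False
```

With the spreadsheet loaded as in the script above, something like `data["Masthead"].apply(lambda m: overlaps(m, 1901, 1939)).sum()` should give the count directly, without writing a second file.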
Here’s the file I used (sans The Australian Women’s Weekly, which isn’t a newspaper):
So, now you can more easily find out which Australian newspaper mastheads existed in your chosen periods.
NB: This may not be entirely accurate if some newspapers are not listed or if a date has been entered incorrectly.
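To get a feel for how many entries might be affected, a quick check like the one below can flag the obvious cases. It is only a sketch: `date_problems` is a hypothetical helper, it assumes the same four-digit year format as the script above, and it can only catch entries with no year at all or with the end year before the start year. A date that is wrong but plausible will pass straight through.

```python
import re


def date_problems(text):
    """Return a list of problems found in one masthead entry (empty if none)."""
    if not isinstance(text, str):
        return ["not text"]
    problems = []
    # An entry with no four-digit year can never match any date range
    if not re.search(r"\d{4}", text):
        problems.append("no four-digit year found")
    # A range whose end comes before its start was probably mistyped
    for start, end in re.findall(r"(\d{4})\s*[-–]\s*(\d{4})", text):
        if int(end) < int(start):
            problems.append(f"end year {end} is before start year {start}")
    return problems
```

Applied to the Masthead column, any row that returns a non-empty list is worth checking by hand against the Trove list.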