User:ZeLonewolf/Searching mailing lists

From OpenStreetMap Wiki
Searching OSM mailing lists

The OSM mailing lists are a long-standing forum for discussion. Archives are available online (for example, the archive for the tagging list), but there is no easy, reliable way to search across the lists to find out whether a topic has been discussed before.

To solve this problem, the bash script below downloads the monthly archives of every list into a directory tree of plain-text files (lists/<list>/<year>-<month>.txt), which can then be searched with standard Unix tools such as grep. The script relies on GNU date (for --date) and GNU grep (for -P).

#!/bin/bash
#Inspired by https://github.com/matkoniecz/mailing-list-downloader/blob/master/fetch-files.rb by Mateusz Konieczny

#Date setup (archive file names use English month names, so force the C locale)
export LC_ALL=C
endmonth=$(/bin/date +%Y-%B)

#Grab mailing lists
curl "https://lists.openstreetmap.org/listinfo" > lists.html
#Extract the list names from the listinfo links, dropping any duplicates
grep -oP 'listinfo/\K[\w-]+' lists.html | sort -u > lists.txt

while IFS= read -r list; do

  currentdate="2004-01-01"
  month=""
  mkdir -p "lists/$list"

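  #Loop through each list/month combo and download.
  #The date is advanced before the first fetch, so the first month requested is 2004-February.
  until [ "$month" == "$endmonth" ]
  do
    currentdate=$(/bin/date --date "$currentdate 1 month" +%Y-%m-%d)
    month=$(/bin/date --date "$currentdate" +%Y-%B)
    #-f makes curl fail on HTTP errors, leaving an empty file that is deleted later
    curl -f "https://lists.openstreetmap.org/pipermail/$list/$month.txt.gz" > "lists/$list/$month.txt.gz"
  done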

done <lists.txt

#Remove months with no mailing list activity
find lists -type f -size 0 -delete

#Unzip
gunzip -r lists

#Cleanup
rm -f lists.txt
rm -f lists.html
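
Once the script has finished, the archives can be searched with grep. The helper below is a minimal sketch and not part of the original script: the function name search_lists and its arguments are hypothetical, and it assumes it is run from the directory that contains the lists/ folder created above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): search the
# downloaded archives for a pattern, ignoring case.
# Usage: search_lists PATTERN [LIST]
search_lists() {
  local pattern="$1"
  local list="${2:-}"   # optional list name, e.g. "tagging"; empty means all lists
  # -r: recurse into the directory, -i: ignore case,
  # -n: print line numbers, -E: treat the pattern as an extended regex
  grep -rinE -- "$pattern" "lists/$list"
}

# Example calls (run from the directory that contains lists/):
#   search_lists 'highway=path' tagging
#   search_lists 'landuse=forest'
```

Each match is printed as file:line:text, so the file name shows which list and month a message belongs to. Like grep itself, the function returns a non-zero exit status when nothing matches.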