Skip to content

rmcv/dc2015

Repository files navigation

dc2015

Some Clojure scripts to digest Hong Kong District Council Election.

Overview

Elected Ratio Analysis

./Elected%20Ratio%20Pan%20Democracy.jpg ./Elected%20Ratio%20Pan%20Establishment.jpg ./Elected%20Ratio%20Others.jpg

Processing

Step 1

Create some utility functions to simplify parsing html table

(def parse-slurp (comp parse slurp))

(defn- read-hdrs [tbl data]
  (->> (extract-from data tbl [:x] "tr th" text)
       first vals first))

(defn- read-data [url]
  (let [d    (parse-slurp url)
        hdrs (read-hdrs "table" d)]
    (->> (extract-from d "table tr" [:x] "td" text)
         (map :x)
         (remove nil?)
         (map #(zipmap hdrs %)))))

Step 2 - results

Ungroup the result table (i.e. split the merged cell)

(->> (extract-from data "table.contents2 tr" [:x :y] "td[rowspan]" text "td" text)
     (remove #(every? nil? (vals %)))
     (reduce (fn [[last res] {:keys [x y]}]
               (if (nil? x)
                 [last (conj res (concat last y))]
                 [x    (conj res y)]))
             [nil []])
     last
     (map #(zipmap hdrs %))
     to-dataset)

Step 3 - noms

Use the utility functions created in Step 1 to parse the master nomination table

(->> "http://www.elections.gov.hk/dc2015/pdf/2015_DCE_Valid_Nominations_C.html"
     read-data
     to-dataset)

Step 4 - noms2

Collect all other nomination data

(->> (extract-from (parse-slurp "http://www.elections.gov.hk/dc2015/chi/nominat2.html")
                   "table tr td"
                   [:x] "a" (attr :href))
     (map :x)
     (remove nil?)
     (apply concat)
     (filter #(and (string? %) (re-matches #"\.\./pdf/nomination.*html" %)))
     (map #(->> % (drop 2) (apply str) (str "http://www.elections.gov.hk/dc2015")))
     (mapcat read-data)
     to-dataset)

Step 5 - mm

Join the tables created in Step 3 - noms and Step 4 - noms2

($join [["選區代號" "獲提名人士姓名 (姓氏先行)"] ["選區號碼" "姓名"]] noms2 noms)

Step 6

Final step - join the nomination data with election result and output to Excel

(-> ($join [["選區號碼" "候選人編號"] ["Constituency Code" "Candidate Number"]]
           mm
           results)
    (save-xls "output.xls"))

License

Copyright © 2015 RMCV

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.

About

Hong Kong District Council Election 2015

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published