filer

File manipulation tasks tend to be repetitive and full of boilerplate code, so it helps to keep the most commonly used functions in one place where they can be reused when needed. That is exactly what filer is: a library of file manipulation functions that strive to be highly (re)usable.

The library is divided into namespaces that each specialize in one file type, but they can all be used in combination. There are namespaces for manipulating directories (dir.core), text files (txt.core), XML files (xml.core) and HTML files (html.core). Here I would like to focus only on the html.core namespace, but if you want to check out the others as well, go to the library’s Sourcehut page.

The main purpose of the HTML namespace functions is to perform structure and content assertions on HTML files. The need for such functions arose while testing a mass mailing system. The e-mail messages sent to customers had to be tested thoroughly by checking that every required feature was in its place. The best way to do that was to select that place in the message and assert its contents against the expected value.

The main function, which performs this action, is assert-select. It is fully backed by the enlive library and uses enlive’s selectors for document navigation. The inspiration came from this Ruby on Rails method.

The function takes an HTML document, a vector of selectors and an assertion function, which should be a predicate. The HTML document can be a string, a file or an enlive node map. The predicate is applied to every returned node and must accept exactly one argument: a node map, the map that enlive returns. So to query a particular part of the node (:tag, :attrs or :content), the predicate must select the appropriate key.
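To make the shape of that single argument concrete, here is a minimal sketch that needs neither filer nor enlive. The node below is hand-written to mimic the :tag, :attrs and :content keys described above; its values and the name link-to-top? are made up for illustration.

```clojure
;; Hypothetical node map, shaped like the maps enlive produces.
;; The tag, attributes and content here are invented for this example.
(def sample-node
  {:tag     :a
   :attrs   {:href "#top" :class "nav"}
   :content ["Back to top"]})

;; A predicate takes exactly one node map and returns a truthy or falsy value.
(defn link-to-top? [node]
  (and (= :a (:tag node))
       (= "#top" (-> node :attrs :href))))

(link-to-top? sample-node)        ; => true
(-> sample-node :content first)   ; => "Back to top"
```

Any function of this shape can be passed as the assertion function.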

All the following examples are executed against an HTML test file in the filer repository.

First of all, you can run an assertion against multiple nodes at once.

(html/assert-select test-html [:h1]
	(fn [node] (string? (-> node :content first))))
=> true

:content is a list, so we must take the first (and only) element from it. This can be used to perform all kinds of assertions, such as checking whether there are <li> elements without any content.

(html/assert-select test-html [:li]
	(fn [node] (not (empty? (-> node :content)))))
=> true

The output is true, so all <li> elements have some content. Assertions are not limited to node content; we can query node attributes as well. Let’s check whether all <a> elements have an href attribute with some value.

(html/assert-select test-html [:a]
  (fn [node]
    (when-let [href (-> node :attrs :href)]
      (not= "" href))))
=> true

When we want to assert on a node at a specific position, like ‘assert on the 5th h1 tag in the document’, we can supply additional options to the assert-select function.

(html/assert-select test-html [:h1]
      (fn [node]
        (= "Paragraphs" (-> node :content first)))
      {:n 4})
=> true

The :n option is currently the only option that this function takes. It is the position of the node in the list that enlive returns, which is effectively its position in the document. Position numbers start from 0, so {:n 4} picks the 5th match.
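The difference between asserting on every match and asserting on one position can be sketched on plain data. This is not filer’s source: check-nodes and the headings vector are hypothetical stand-ins, and returning false for an out-of-range position is an assumption made only for this sketch.

```clojure
;; Hypothetical stand-in for the "all nodes vs. the nth node" behaviour.
;; `nodes` is a sequence of node maps, like the list a selector would return.
(defn check-nodes
  ([nodes pred]
   ;; No position given: the predicate must hold for every node.
   (every? pred nodes))
  ([nodes pred {:keys [n]}]
   (if (some? n)
     ;; nth with a default of nil avoids an exception when n is out of range.
     (boolean (when-let [node (nth nodes n nil)]
                (pred node)))
     (check-nodes nodes pred))))

;; Invented sample data: two h1 nodes in document order.
(def headings
  [{:tag :h1 :attrs {} :content ["Intro"]}
   {:tag :h1 :attrs {} :content ["Paragraphs"]}])

(def paragraphs? #(= "Paragraphs" (-> % :content first)))

(check-nodes headings paragraphs?)          ; => false, the first heading differs
(check-nodes headings paragraphs? {:n 1})   ; => true, index 1 is the second heading
```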

If we know the particular node we want to assert on, we can make full use of enlive’s selector syntax to reach nodes by id and/or by their position in the hierarchy.

(html/assert-select test-html [:div#top :> :header :> :h1]
      (fn [node]
        (= "HTML5 Test Page" (-> node :content first))))
=> true

CSS syntax is supported, so we can append #top to the node name to select only the node with that id. :> is the child combinator: the step that follows it must match a direct child of the step before it. Read more about selector syntax on the enlive page.

A few helper functions are already defined for common cases, so we can avoid writing full predicate functions. For example, for a content equality assertion we can use the content=? function.

(html/assert-select test-html [:div#top :> :header :> :h1]
	(html/content=? "HTML5 Test Page"))
=> true
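A helper of this kind can be written as a function that takes the expected value and returns a predicate. The sketch below is a hypothetical re-implementation named my-content=?, not filer’s actual source, and it assumes the content to compare is the first element of :content.

```clojure
;; Hypothetical re-implementation for illustration, not filer's code.
;; Returns a one-argument predicate that closes over `expected`.
(defn my-content=? [expected]
  (fn [node]
    (= expected (-> node :content first))))

;; Invented node map in the shape enlive returns.
(def h1-node {:tag :h1 :attrs {} :content ["HTML5 Test Page"]})

((my-content=? "HTML5 Test Page") h1-node)   ; => true
((my-content=? "Something else") h1-node)    ; => false
```

Because the helper returns a predicate, its result can be passed wherever a hand-written (fn [node] ...) would go.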

There is also a function for regex matching. For example, let’s check whether a node’s content consists only of uppercase letters.

(html/assert-select test-html [:abbr]
	(html/content-matches? #"[A-Z]+")
	{:n 0})
=> true
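Whether a pattern like this means "only uppercase letters" depends on how it is applied. In Clojure, re-matches requires the whole string to match, while re-find accepts a match anywhere in it. Which of the two content-matches? uses is not stated here, so the predicate below, all-caps-content?, is a hypothetical one that picks re-matches explicitly.

```clojure
;; Whole-string match vs. match anywhere in the string.
(re-matches #"[A-Z]+" "HTML")   ; => "HTML"
(re-matches #"[A-Z]+" "Html")   ; => nil
(re-find    #"[A-Z]+" "Html")   ; => "H"

;; Hypothetical predicate (not part of filer) that enforces a full match
;; on the first element of :content, and rejects non-string content.
(defn all-caps-content? [node]
  (let [text (-> node :content first)]
    (boolean (and (string? text)
                  (re-matches #"[A-Z]+" text)))))

(all-caps-content? {:tag :abbr :attrs {} :content ["HTML"]})   ; => true
(all-caps-content? {:tag :abbr :attrs {} :content ["Html"]})   ; => false
```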

Let’s take a look at some other helper functions. For asserting on node attributes…

(html/assert-select test-html [:circle]
	(html/attributes-contain? :cx "100"))
=> true

For checking link target…

(html/assert-select test-html
	[:article#embedded__progress :> :footer :> :p :> :a]
	(html/link-target=? "#top"))
=> true

Link name…

(html/assert-select test-html
	[:article#embedded__progress :> :footer :> :p :> :a]
	(html/link-name=? "[Top]"))
=> true

Assertions on img nodes…

(html/assert-select test-html
	[:article#embedded__images :> :div :> :p :> :img]
	(html/img-src=? "http://placekitten.com/480/480"))
=> true

(html/assert-select test-html
	[:article#embedded__images :> :div :> :p :> :img]
	(html/img-src-alt=? "http://placekitten.com/480/480" "Image alt text"))
=> true

And a special case for asserting on an img inside an a tag (a picture used as a link).

(html/assert-select test-html
	[:p#img-as-link :> :a]
	(html/img-as-link=? "http://placekitten.com/480/480" "#top"))
=> true

These are just some of the common cases that this function can handle. It provides enough flexibility to support all kinds of assertions. For more details, check out the source code on the repo page. I hope you find the library useful.