Finnian's blog

Software Engineer based in New Zealand

During the YRS Festival of Code 2015, “SubjectRefresh” and I created a revision app called Refresh. It’s built with Node.js and works by scraping the exam board’s website (currently only CIE) for the PDF of the syllabus the user has requested. The PDF is converted to HTML with a PDF-to-HTML converter, and the HTML is then passed through the Cheerio library. From there, we work out where the relevant information sits in the HTML and send it off to TextRazor.
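
The flow above can be read as a chain of four asynchronous stages. The sketch below only illustrates that ordering and is not Refresh’s actual code: `runPipeline` and the four stage names are hypothetical, and each stage is passed in as a function so that no particular scraper, converter, or TextRazor client API is assumed.

```javascript
// Hypothetical sketch of the stage ordering; none of these names come from Refresh itself.
// Each stage is injected, so no real scraper, converter, or TextRazor call is assumed.
async function runPipeline(syllabusCode, stages) {
  const { fetchSyllabusPdf, convertPdfToHtml, extractRelevantText, analyseText } = stages;

  const pdf = await fetchSyllabusPdf(syllabusCode); // scrape the exam board site for the PDF
  const html = await convertPdfToHtml(pdf);         // PDF -> HTML
  const text = await extractRelevantText(html);     // pick out the relevant part of the HTML
  return analyseText(text);                         // hand the text to a text-analysis service
}

// Stand-in stages, purely for demonstration; real ones would do network and file work.
const demoStages = {
  fetchSyllabusPdf: async (code) => `pdf:${code}`,
  convertPdfToHtml: async (pdf) => `<p>${pdf}</p>`,
  extractRelevantText: async (html) => html.replace(/<[^>]+>/g, ""),
  analyseText: async (text) => ({ text, keywords: [] }),
};

runPipeline("1234", demoStages).then((result) => console.log(result));
```

Passing the stages in as functions keeps each step replaceable, which would matter if other exam boards needed their own scraper.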

We then use the information TextRazor returns about the text to construct questions for the user. These are gap-fill questions, made by removing keywords from the text. We chose this format because the answers are already in the text, so we didn’t need an algorithm to decide whether a response to a free-text question, which could have many valid answers, was actually correct. It also lets us avoid multiple-choice questions, which can be answered by elimination or by memorising the answers.
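
As a rough illustration of the gap-fill idea, here is a minimal sketch that blanks out one keyword in a sentence and checks a response against it. The function names and the sample sentence are made up for this example, and the keyword is assumed to come from the text analysis step; this is not the code Refresh uses.

```javascript
// Hypothetical helpers for illustration; not taken from Refresh.

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace the first whole-word occurrence of `keyword` with a blank.
// Assumes the keyword starts and ends with a letter or digit, so \b applies.
function makeGapFill(sentence, keyword) {
  const pattern = new RegExp(`\\b${escapeRegExp(keyword)}\\b`, "i");
  const match = sentence.match(pattern);
  if (!match) return null; // keyword does not appear in this sentence
  return {
    prompt: sentence.replace(pattern, "_____"),
    answer: match[0],
  };
}

// The answer came from the text, so checking is a simple comparison
// that ignores case and surrounding whitespace.
function isCorrect(question, response) {
  return response.trim().toLowerCase() === question.answer.toLowerCase();
}

const question = makeGapFill("Photosynthesis takes place in the chloroplasts.", "chloroplasts");
console.log(question.prompt);                       // Photosynthesis takes place in the _____.
console.log(isCorrect(question, " Chloroplasts ")); // true
```

Because the expected answer is just the removed keyword, marking needs no understanding of the text, which is the property the paragraph above relies on.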

I wrote a large part of the scraping code and built the first revision of the front end. I also worked on the algorithms for locating the relevant data in the HTML.

In the future, we would like to expand it to cover not only the other exam boards (Edexcel, OCR, AQA, etc.) but also other things, such as A-levels and driving theory.
