Glen Pitt-Pladdy :: Blog
IMDB ratings for MythTV
I lead a busy life and rarely have time to waste. I've been using MythTV for a few years now, and it lets me arrange TV so that it fits with my life: I can time-shift programs, watch informative programs at 1.5X speed to get the interesting bits without wasting time on all the dressing and drama added to them, take my TV with me over the network, pause live TV, set up and plan my viewing while I am away from home anywhere I can get an internet connection, and much more.

What is good to watch?

Currently the program guide has around ten thousand programs in it for the next 7-8 days. Almost all of that is of no interest to me at all. All the channels are trying to make their programs look good, and sorting through the chaff is difficult.

For movies there is one advantage: IMDB is a massive database of movies, reviews and viewer ratings plus trivia and much more. Since it is currently a place for the most dedicated movie fans, I find its viewer ratings very reliable: I enjoy almost everything that gets a rating of 8 or more. IMDB also publishes a freely downloadable dump of its database, which makes it practical to import into MythTV.

IMDB to MythTV

There are many other scripts around for doing this. After playing with many of them, I found that they all had deficiencies that made them impractical for me. One of the biggest problems is that there seems to be little consistency in the program guide data on DVB. Some channels categorise loads of things as movies which are not (eg. home improvement or cooking programs), some channels set the release date as 2000 irrespective of what it actually is, some titles are full of typos, sometimes the subtitle is included in the title, sometimes ... I could go on for ages. In short, the data in the program guide is very inconsistent.
If anything was going to be practical, then it needed to be able to cope with all the inconsistencies, and have manual overrides for when things went beyond what could be automated. What I did was to load the database and all the aliases, apply various regular expressions to improve consistency (eg. sometimes "one" is written out, other times just the number is given) and then attempt different matching strategies, eventually falling through to Levenshtein edit distance, which helps where there are typos.

The script is a bit rough-and-ready, but I have decided to make it available anyway.

Download: Perl script to add IMDB ratings to MythTV movies

You will also need to download the IMDB dumps from one of their FTP sites. Specifically you need the Ratings, AKA titles and ISO AKA titles dumps.

There are a bunch of variables to set in the script for the location of various files. You also need to ensure that you have the Text::Levenshtein Perl module.

I have a cron job that runs this each morning after the standard MythTV cron job, so ratings get added immediately after the program guide is updated.

Other ideas

We have created effective ways of filtering spam in email. This often relies on learning filters based on Bayesian inference. What I am curious about is whether I could use these to learn my viewing habits and train a Bayesian filter to rate upcoming programs based on past knowledge. I am currently experimenting with this using dbacl as the classifier, and at this stage am waiting for enough data to train the filter properly. I may also give NLTK a try at a later date.
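To make the matching approach described earlier a bit more concrete (tidy up the titles, try an exact match, then fall back to edit distance), here is a minimal sketch in Python. It is not the Perl script linked above and does not reflect its actual rules: the function names, the number-word table, the `max_distance` threshold and the sample titles and ratings are all made up for illustration.

```python
import re

# Hypothetical number-word table; the real script's regular expressions are not shown in the post.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
                "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10"}


def normalise(title):
    """Lower-case, spell out '&', strip punctuation and map number words to digits."""
    text = title.lower().replace("&", " and ")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(NUMBER_WORDS.get(word, word) for word in text.split())


def levenshtein(a, b):
    """Edit distance between two strings, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute (free if equal)
        prev = cur
    return prev[-1]


def best_match(guide_title, ratings, max_distance=2):
    """Return (imdb_title, rating) for a guide title, or None if nothing is close enough.

    ratings maps IMDB titles to ratings; max_distance is an assumed cut-off.
    """
    wanted = normalise(guide_title)
    index = {normalise(title): title for title in ratings}
    # Strategy 1: exact match after normalisation.
    if wanted in index:
        title = index[wanted]
        return title, ratings[title]
    # Strategy 2: fall through to the closest title by edit distance.
    closest = min(index, key=lambda key: levenshtein(wanted, key), default=None)
    if closest is not None and levenshtein(wanted, closest) <= max_distance:
        title = index[closest]
        return title, ratings[title]
    return None


# Made-up titles and ratings, purely for demonstration.
sample_ratings = {"The Long Night": 8.3, "Room 1": 6.9}
print(best_match("Room One", sample_ratings))
print(best_match("The Long Nihgt", sample_ratings))
print(best_match("Cooking With Gas", sample_ratings))
```

The threshold is the part that needs tuning: too low and typos are missed, too high and cooking programs start picking up movie ratings, which is where manual overrides come in.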
Disclaimer: This is a load of random thoughts, ideas and other nonsense and is not intended to be taken seriously. I have no idea what I am doing with most of this so if you are stupid and naive enough to believe any of it, it is your own fault and you can live with the consequences. More importantly this blog may contain substances such as humor which have not yet been approved for human (or machine) consumption and could seriously damage your health if taken seriously. If you still feel the need to litigate (or whatever other legal nonsense people have dreamed up now), then please address all complaints and other stupidity to yourself as you clearly "don't get it".
Copyright Glen Pitt-Pladdy 2008-2023