Analyzing movie rating data from an IMDB.com dataset using Python, Pandas and Matplotlib

    Since the dawn of cinema, the quality of motion pictures and the enjoyment they produce have been a complicated and controversial subject. An entire sub-industry has been created to review, criticize, recommend, analyze, categorize and rate movies. This, added to the subjective nature of each individual's likes and dislikes, has resulted in mixed experiences and expectations for the public. Movies that some people regard as timeless classics are seen as boring, or even bad, by others. The passing of time and the recent heavy use of special effects and CGI also affect how movies will be regarded in a few years, when those special effects look outdated.

    However, we may find a general trend of increased satisfaction or dissatisfaction if we analyze a large number of movie ratings.

    The code I wrote for this analysis is available in my GitHub repository.

Research question

    Since many critics refer to their favorite period as the best era that cinema has to offer (or, alternatively, claim that movie quality is in decline), we will attempt to answer the following question:

Has the perceived quality of movies increased or decreased over time?

    Whatever answer we find, we will support it with data and visualizations.

Findings

    Using an IMDB.com dataset, I analyzed 45,844 movies and 26,024,290 ratings for said movies. The oldest movie in the dataset was launched in 1874 and the newest in 2017.

    I grouped the movies by launch year and calculated the average rating for the movies launched in each year. By doing this, I wanted to get an idea of the overall quality of movies through time.
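    The grouping step described above can be sketched with Pandas as follows. This is only a minimal illustration, not the code from the repository: the tiny data frames and the column names (`movie_id`, `launch_year`, `rating`) are assumptions made for the example. It also averages all ratings of a year's movies directly; averaging each movie first would weight every movie equally, and which of the two the analysis used is not stated here.

```python
import pandas as pd

# Tiny stand-in data. The column names are hypothetical,
# chosen for illustration; they are not the dataset's real columns.
movies = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "launch_year": [1995, 1995, 2014],
})
ratings = pd.DataFrame({
    "movie_id": [1, 1, 2, 3, 3],
    "rating": [4.0, 5.0, 3.0, 2.0, 3.0],
})

# Attach each rating to the launch year of its movie.
merged = ratings.merge(movies, on="movie_id", how="inner")

# Average every rating given to the movies launched in each year.
yearly_average = merged.groupby("launch_year")["rating"].mean()

print(yearly_average)
```

    The result is a Series indexed by launch year, which is a convenient shape for plotting a rating-per-year chart.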

    While the technical aspects of motion pictures have obviously advanced thanks to new technology, it was not clear whether this also improved the overall quality of the movies. The next image shows a chart of the relation between launch year and average rating for the movies launched in each year.

Fig. 1 - Movie average rating per year.

    Some interesting facts I got from this analysis:

  • The initial period (from 1874 to around 1915) is chaotic and experimental. A film could be under a minute long. There was little to no cinematic technique, and films were usually black and white and silent.
  • In the 1920s, a process of normalization begins. Movies are more popular and attainable. The public seems to have learned what to expect from directors and actors by this time. The primary steps in the commercialization of sound cinema were taken in the mid-to-late 1920s, which could have contributed to this normalization.
  • 2014 was a relatively disastrous year for cinema. The average rating for this year is 2.95, the lowest since 1917 (which is still part of the “experimental period”). The causes are beyond the scope of this analysis, but I will remind the reader that 2014 gave us movies like Transformers: Age of Extinction and Left Behind, the latter of which currently has a 1% score on rottentomatoes.com.
  • On a scale from 0 to 5, the average tends to be slightly above 3. There is no noticeable increase or decrease from this average in the last century. So, to answer the research question, we cannot say that the quality of cinema has increased or decreased substantially.

References

  1. History of film. (2017, November 26). In Wikipedia, The Free Encyclopedia. Retrieved 04:03, November 29, 2017, from https://en.wikipedia.org/w/index.php?title=History_of_film&oldid=812220038
  2. Sound film. (2017, November 28). In Wikipedia, The Free Encyclopedia. Retrieved 04:04, November 29, 2017, from https://en.wikipedia.org/w/index.php?title=Sound_film&oldid=812587250

Testing Microsoft SQL Server on Linux

    Years ago, I would never have thought of writing a title like that, but things change, and Microsoft is now paying attention to environments other than its own. This has given us the chance to try some of its tools without being forced to install Windows, which is much appreciated.

    I am going to install SQL Server on Linux (specifically, on Linux Mint). Here I will detail the installation process as a tutorial, and I will also run a few simple queries to demonstrate the use of SQL Server.

    During the installation we will need administrator permissions, so we switch to our root account:

    We create a directory for SQL Server tools and notes and change into it (this step is entirely optional):

    We download and install the keys for the Microsoft repositories:

    We add the Microsoft repository:

    We update our repository list:

    Finally, we install SQL Server:

    We run the configuration tool:

    We verify that the SQL Server service is running correctly:

    Now we install the command-line tools:

    We connect to our local instance:

    We create a test database and list the existing databases:

    In our new database, we create a test table and insert a record:

    We run a test query:

    As you can see, SQL Server was unexpectedly easy to install, configure and use, especially considering that we used an operating system that is not Microsoft's own. In this exercise we learned how to create tables and run queries with the SQL Server tools, which are very similar to the tools of other RDBMSs, such as MySQL.


Getting out of the maze with A star

    A local IT company (who shall remain unnamed in this post and shall be thankful for that) was offering free tickets to this year’s Campus Party event in Mexico. To get the tickets, you needed to complete a programming challenge. Since I’ve never attended any Campus Party* and I enjoyed solving the programming challenge for the Wizeline event, I took some time to solve this one.

    Basically, the challenge was to find the optimal way out of a square, two-dimensional maze. Using an API, you registered for the challenge and requested a maze of size n by n. You were then supposed to find the optimal path out of it (using a program, of course) and submit that path as a list of (x,y) positions, starting at (0,0) and finishing at the position where a goal (designated by “x”) was placed. An example of a small 8×8 maze would be something like this:

    The zeroes are obstacles or walls; the ones are clear paths. So the solution in the previous example is obvious: (0,0), (0,1), (0,2), (0,3), (1,3), (2,3), (2,2), (3,2), (4,2), (4,3), (4,4), (5,4), (6,4), (7,4), (7,5), (7,6), (7,7).

    Here is another example:

    Which is still obvious. However, when I began to request bigger mazes, I noticed the full complexity of the problem: there were many bifurcations and dead ends. Of course, mazes are supposed to be confusing; that is the whole point of their existence. And not only that, I had to submit the shortest path from start to finish. That meant I could not use a brute-force method that walks every possible path in the maze until it finds the goal. The best part was that I had to solve a 1000×1000 maze to claim the prize.

    A 1000×1000 maze might not sound very big, but once you think about all the possible configurations in that space, you realize it is not an easy task. Thankfully, getting out of mazes is a very old problem, pioneered by Cretan kings who wanted to hide away their funny-looking stepsons. For that reason, a lot of smart people have spent a lot of time trying to find the best solution to such a problem, better known as the “shortest path problem”. Among those people was Edsger W. Dijkstra, a Dutch mathematician and computer scientist who rarely used a computer. Dijkstra is one of the elder gods of computer science and now spends his afterlife looking disapprovingly at students who use GOTO statements.

    In 1959, Mr. Dijkstra published an eponymous algorithm to find the shortest path between two points in any structure where there could be obstacles, varying distances, bifurcations, and dead ends. This algorithm (or a similar algorithm, at least) is what mapping software uses to recommend routes (I believe Google Maps uses contraction hierarchies, since they both need to and can pre-compute routes in order to improve execution times).

    One variation of Dijkstra’s algorithm is the A* algorithm (pronounced “A Star”). It was created in 1968 by Peter Hart, Nils Nilsson and Bertram Raphael, all of them researchers at the Stanford Research Institute. A*, in turn, has many variations.
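    To make the idea concrete, here is a minimal sketch of A* on a grid maze. It is not the solution from my repository, just an illustration under a few assumptions: cells are addressed as `maze[x][y]`, a value of 0 is a wall, moves are allowed in four directions with a cost of 1 each, and the heuristic is the Manhattan distance to the goal. The function name `a_star` and the sample maze are made up for the example.

```python
import heapq


def a_star(maze, start, goal):
    """Return the shortest path from start to goal as a list of (x, y)
    cells, or None if the goal cannot be reached.

    Assumptions of this sketch: maze[x][y] == 0 is a wall, anything else
    is walkable, and moves are 4-directional with unit cost.
    """

    def heuristic(cell):
        # Manhattan distance never overestimates the remaining cost here,
        # so the first time the goal is popped its path is optimal.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Heap entries are (estimated total cost, cost so far, cell).
    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}

    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if cost > best_cost[current]:
            continue  # Stale entry: a cheaper route to this cell was found.
        x, y = current
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < len(maze) and 0 <= ny < len(maze[nx])):
                continue  # Outside the maze.
            if maze[nx][ny] == 0:
                continue  # Wall.
            new_cost = cost + 1
            if new_cost < best_cost.get((nx, ny), float("inf")):
                best_cost[(nx, ny)] = new_cost
                came_from[(nx, ny)] = current
                heapq.heappush(
                    open_heap,
                    (new_cost + heuristic((nx, ny)), new_cost, (nx, ny)),
                )
    return None


# A made-up 4x4 maze, only for illustration.
sample_maze = [
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
print(a_star(sample_maze, (0, 0), (3, 0)))
```

    With a heuristic of zero this sketch behaves like Dijkstra’s algorithm; the heuristic only steers the search toward the goal so that fewer cells need to be visited.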

    So, I used an implementation of the A* algorithm to successfully find the shortest path in the 1000×1000 maze. I sent my solution, and even though the API itself confirmed it was the optimal path out of the maze, I never got the prize, not even a reply saying “Somebody solved it before you” or “We ran out of prizes”. I found that very unprofessional and irritating, since the rules specifically said to send an email to a certain person to notify them about the solution.

    Since the Campus Party is now over and I am still a little salty about being ignored, I uploaded my solution to my GitHub repository. It is a very quick and dirty solution, but it works, so don’t laugh too much (or better yet, improve it and create a pull request). Thankfully, I learned many interesting things and had fun doing this programming exercise, so it was not a complete waste of time.

 

* The total number of actual parties I’ve attended tends to zero.


DevOps crash course at Wizeline

Prelude

    A few weeks ago I was contacted by a friend who used to work with me. He told me that he was now working at Wizeline, a relatively new IT company. I had heard of them before and was intrigued. He also told me they would be sponsoring a two-day DevOps crash course and invited me to attend.

    To be honest, my interest in DevOps is minimal. I still think the area needs to become much more mature to get the recognition it deserves. That is precisely why I was interested in attending. I wanted to learn more about it and get a more informed opinion about the professionals in the DevOps world and what I can learn from them. Turns out, I can learn a lot.

    The course was part of the Wizeline Academy initiative, an effort from Wizeline to impart knowledge to their employees and to other people in the community. That effort has been impressive so far. They have organized several courses (including free meals and swag for the attendees) and also sponsored meetups for the local communities of programmers, project managers, and DevOps professionals. Hopefully they will soon organize a data science course like the one they did in Mexico City.

    The lecturer was Kennon Kwok, a customer architect working at Chef itself, who regularly provides this course. I researched Kennon and was very impressed with his resume. Seriously, take a look at this.

    The most fun part of this was applying to the course. It was not enough to send your information. Since Wizeline’s offices are not that big and it was clear there would be much interest in the course, they implemented a “capture the flag” challenge to narrow the attendees down to a final list of just 25. It was very similar to other challenges I’ve seen from IT recruiters. It involved following directions and clues that only led to more difficult clues. For example, most of the instructions were encrypted, so you had to decrypt them first and then understand the clue. I really wanted to explain the whole process and how I solved it, but I think it is better when people solve these things on their own. Besides, the challenge was not that hard. It had just the right amount of difficulty to stay interesting while still being challenging. To be honest, solving it was the most fun I’ve had in months. During the course I heard that hundreds of people applied, but only a few of us solved it all.

    A few days after I completed the challenge, Wizeline contacted me to ask whether I would be able to attend. They also asked me a few questions, just to confirm that my English was good enough to understand a native speaker like Kennon. At this point I was really excited about the course. The fact that I had earned my place by solving the challenge made me look forward to it. However, the challenge was aimed much more at programming and hacking skills than at DevOps.

    A couple of days before the course, the local DevOps meetup took place precisely at the Wizeline offices. One of the lecturers was Kennon. The other one was Basilio Briceño, another DevOps engineer who has a lot of experience. Attending this meetup was probably overkill since I was already going to attend the course in a few days, but I wanted to get to know the Wizeline offices before the training began. Besides, the DevOps meetup is organized by Emerson, who has been a great friend for years, and I also got to see many people who used to work with me. This meetup was sponsored by both Wizeline and Epam, which is surprising since the two companies are competitors and lately have been in an unspoken war trying to hire as many people as possible. It is nice to see that rival companies can collaborate to improve the local community.

 

The course

    I arrived early and got a healthy breakfast that was provided by Wizeline. I realized there were almost 30 of us attending the course and I only knew one of them. This was not surprising. After all, the challenge required some programming skills, and most people I know have gravitated towards the DevOps/Sysadmin path. There was also a Project Management group having breakfast with us. They were there for another of the Wizeline Academy courses.

    I won’t go into much detail about what we actually did in the course, but I will mention that we played with recipes, Berkshelf, resources, cookbooks, tests, virtual machines and containers. I think we all learned a lot, even people who already had some experience with Chef. Kennon was very patient but also challenged us to think outside the box and try our own solutions. The Wizeline DevOps team was also present, trying to learn from Kennon and serving as guides to the rest of us.

 

Conclusions

    At the end of the course we had a little “graduation ceremony” where we received a symbolic document about our accomplishment. We also got a free Wizeline t-shirt and a couple of stickers. There was also a little party with free drinks and snacks (which is something Wizeline employees enjoy permanently). Alas, I had to attend a concert, so I had to leave early. You can read all this from Kennon’s point of view in his LinkedIn post.

    Overall, I am very satisfied with this experience. Everything was very enjoyable and interesting, from the selection process and the logistics to the actual course and the aftermath. I could tell that a lot of effort went into this, and at times I could not believe I was getting all this for free. As I said before, hopefully Wizeline (and other companies) will continue organizing and sponsoring these events. It is obvious that they help the IT community, which in turn raises the level of knowledge and abilities of all of us, and this is extremely beneficial to companies like Wizeline.


Trip to Las Vegas

    In the last post I wrote about my work in Las Vegas, but I did not write about two fundamental aspects: The city and its people.

    I have been to Vegas before, but it has been too long since then. I don’t remember what I did last time or where I went. This time, the city seemed new and old at once.

    Las Vegas is where decorum goes to die. In Sin City everybody can easily buy drugs, sex and probably many other things that I am too innocent to know about. Of course, doing any of this is illegal, but it is illegal in the same way that talking about a taboo is. Nobody wants to discuss what is happening, but everybody knows it happens, and they know this because they are part of it.

    I am still surprised by the homeless people that adorn the streets. I have come to expect to see them, but somehow I am still surprised. I remembered my trips to NYC and how out of place the homeless looked near the luxurious stores in midtown Manhattan. They also look out of place in Vegas, but the setting now was the Strip, surrounded by humongous casino resorts that seem conceived in the mind of a Saudi princess during an acid trip.

    And the girls… I could write a whole book about that. Girls in the US are so different from Mexican girls. For instance, they are overloaded with confidence and ego. And this is good. Mexican girls tend to pretend they are not interested and will rarely take the initiative, but all the girls I met in Las Vegas were the complete opposite of that: they approach you, they talk to you, they flirt with you, they take what they want and they do not care about anything else.

    I know what you are thinking: “But Alan, I am pretty sure they were prostitutes!”, and I see your point. Prostitutes in Las Vegas roam freely everywhere. Some of them are very obvious, wearing transparent cocktail dresses, fishnet stockings, and heels that double their height. But some of them are incredibly classy ladies who would not look out of place at a Nobel Prize award ceremony. You can try to identify them, and after a few days you will get good at it, but you will never know for sure.

    So, how do I know the girls were flirting with me and not my money? I know because all the girls who approached me were either staying in the hotel or working in the same place I was. In fact, two of them were staying a few doors from my room. Another one was a make-up artist who fixed my hair for some photographs. I was so focused on the posing that when she asked how I wanted her to do my hair, I told her to do anything she wanted with me. She told me that I should not say that. While saying that, she looked at me in a way that made me understand that she was already thinking way too many things that she could do with me.

    I don’t want to sound like I am bragging about all this. I am writing this because these are perfect examples of how people behave in Vegas: They come here to have fun and do not care about anything else. They drink too much, they have sex with strangers, they take drugs, they bet all of their savings and they do not care, because “What happens in Vegas, stays in Vegas”. Still don’t believe me? Go and ask that girl I saw while having breakfast, the one who wore only a pair of slippers and a tiny hand towel.

    In fact, all this made me think about prostitution in Vegas. As I said before, hiring a prostitute is extremely easy: you get flyers advertising “full experience massages”, and you see ads in the streets and magazines promising girls from every possible ethnicity and their combinations, wearing any kind of outfit your perverted mind could envision. However, I wonder what kind of person would hire them while the other girls are so happy to take that role free of charge. I guess Cyndi Lauper was right: Girls (and boys) just want to have fun, and Vegas is the place to have fun, any kind of fun.
