
Copy from the virtual Linux machine into HDFS (hadoop put)

1. In this assignment, you will work with part of the MovieLens dataset. I selected 2 files for this assignment, which can be found in movielens.zip.
The MovieLens dataset is a collection of movie ratings data that has been widely used in industry and academia for experimenting with recommendation algorithms. Many publications use this dataset to benchmark the performance of their algorithms.

For access to the full-sized MovieLens data, go to http://grouplens.org/datasets/movielens/

-----------------------------
-- Table description "u.data"
-- 
-- field_1 userid
-- field_2 movieid
-- field_3 rating
-- field_4 unixtime
-----------------------------

--> u.item -- Information about the items (movies). The file has 24 pipe ("|") separated columns. This is a list of:

7. Create one table for u.data and one table for u.item in Hive and load the data.
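
A minimal HiveQL sketch of this step, under stated assumptions: the table names u_data and u_item, the u.item column names (taken from the public MovieLens 100K README, not from this handout), the tab delimiter for u.data, and the HDFS paths are all illustrative and should be adjusted to your own setup.

```sql
-- u.data: four fields (userid, movieid, rating, unixtime).
-- Assumption: the file is tab-separated, as in the public MovieLens 100K release.
CREATE TABLE u_data (
  userid   INT,
  movieid  INT,
  rating   INT,
  unixtime BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- u.item: 24 pipe-separated columns.
-- Assumption: column names follow the MovieLens 100K README
-- (5 descriptive fields followed by 19 genre flags holding 0 or 1).
-- Two genre columns carry a prefix to steer clear of possible keyword clashes.
CREATE TABLE u_item (
  movieid            INT,
  title              STRING,
  release_date       STRING,
  video_release_date STRING,
  imdb_url           STRING,
  genre_unknown      INT,
  genre_action       INT,
  adventure          INT,
  animation          INT,
  childrens          INT,
  comedy             INT,
  crime              INT,
  documentary        INT,
  drama              INT,
  fantasy            INT,
  film_noir          INT,
  horror             INT,
  musical            INT,
  mystery            INT,
  romance            INT,
  sci_fi             INT,
  thriller           INT,
  war                INT,
  western            INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- Hypothetical HDFS locations: use wherever the files were copied with hadoop put.
-- LOAD DATA INPATH moves the files from that location into the table's directory.
LOAD DATA INPATH '/user/hive/movielens/u.data' OVERWRITE INTO TABLE u_data;
LOAD DATA INPATH '/user/hive/movielens/u.item' OVERWRITE INTO TABLE u_item;
```

A quick sanity check after loading is to run SELECT COUNT(*) on each table and compare the result with the line count of the source file.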

------------------------
-- Assignment Questions
------------------------

5. Find the highest rated sci_fi movie. Explain how you define "highest rating".
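
One possible way to approach this question, shown only as a sketch: here "highest rating" is defined as the highest average rating among sci-fi movies that received at least a minimum number of ratings, so that a movie with a single 5-star vote does not win. The table names u_data and u_item, the columns title and sci_fi, and the threshold of 50 are assumptions for illustration; other definitions are equally defensible as long as you explain them.

```sql
-- Assumes u_data(userid, movieid, rating, unixtime) and a u_item table
-- with movieid, title and a 0/1 sci_fi flag (hypothetical names).
SELECT i.movieid,
       i.title,
       AVG(d.rating) AS avg_rating,
       COUNT(*)      AS num_ratings
FROM u_item i
JOIN u_data d
  ON d.movieid = i.movieid
WHERE i.sci_fi = 1
GROUP BY i.movieid, i.title
HAVING COUNT(*) >= 50            -- hypothetical minimum-vote threshold
ORDER BY avg_rating DESC, num_ratings DESC
LIMIT 1;
```

Dropping the HAVING clause gives the plain "highest average" definition. Comparing the two results is a simple way to justify whichever definition you choose.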

BONUS: Are there any movies with no ratings? (Hint: outer join and IS NULL)
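
The hint can be sketched as follows, assuming tables named u_item and u_data with a shared movieid column and a title column on u_item (illustrative names). A left outer join keeps every movie, and movies without a matching rating row come back with NULL in the columns of u_data.

```sql
-- Movies that never appear in the ratings table.
SELECT i.movieid,
       i.title
FROM u_item i
LEFT OUTER JOIN u_data d
  ON d.movieid = i.movieid
WHERE d.movieid IS NULL;         -- no matching rating row was found
```

An empty result means every movie in u.item has at least one rating.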

Screenshot of the result

2. Submit using Assessment -> Assignments -> Hive Assignment 1

Uploaded by : Benjamin Farmer
