+1-617-874-1011 (US)
+61-7-5641-0117 (AU)
+44-117-230-1145 (UK)
Live Chat

ECE656 assignment 1

  1. If you have your own computer, install a copy of MySQL on it. If you do not have your own computer, let me know and I will provide access to necessary equipment.
  2. Acquire the Yelp Challenge Dataset: https://www.yelp.com/dataset_challenge/dataset
  3. The Yelp dataset consists of 5 entities: business, review, user, check-in, and tip. The data is in JSON format.
    1. Define a relational schema for these entities, including table names, attributes and appropriate types, define primary and foreign keys, and indexes as necessary.
      1. You may wish to use more than five tables for your scheme; g., the user entity has an attribute “friends” which appears to be other Yelp users.
      2. You will need to justify your relational schema.
    2. Create a database, and all associated tables necessary per your relational schema.
    3. Write a script to convert the data from JSON format to a format suitable for loading into database (g., convert it to CSV).
    4. Load the data, timing your load operations, and verifying that it loads without errors or warnings. Note how many rows are loaded in each of the tables you define.
    5. Confirm the size of each table with various “SELECT count()” queries.
  4. Basic data analysis: write and execute SQL queries that will determine:
    1. The user(s) who has written the greatest number of reviews.
    2. The business(s) that has received the greatest number of reviews.
    3. The average number of reviews written by users.
      1. Note that the user entity maintains an “average stars” rating. However, that value should also be able to be computed from the actual reviews performed by each user. Determine if the recorded “average stars” rating is consistent with the computed value.
    4. The average number of reviews received by businesses.
    5. What fraction of businesses have received more than 10 reviews?
    6. What fraction of users have written more than 10 reviews?
    7. What is the average length of their reviews?
  5. Submit a report that describes your relational schema, all commands you used to create it, load the data, and query it to get the results for part (4). Provide all results for part 4.
Improve Your Grades with Custom Writing Help
Homework Help
Writing Help
Editing Services
Plagiarism check
Proofreading services
Research Project help
Custom writing services
E learning blogs

Disclaimer : The study tools and academic assistance/guidance through online tutoring sessions provided by AssignmentHelp.Net is to help and enable students to compete academically. The website does not provide ghostwriting services and has ZERO TOLERANCE towards misuse of the services. In case any user is found misusing our services, the user's account will be immediately terminated.