Friday, 15 May 2020

Project report and video on data processing and analytics topics

ASSESSMENT TASK
This is a group assignment with individual elements where you are asked to design and implement databases for selected use cases and perform data analytics on given datasets.
A. The first part of the assignment focuses on your understanding and implementation of various database technologies.
Choose one of the given database development scenarios and:
1. Develop an Entity-Relationship model of the information requirements for this scenario.
2. Translate your model into an equivalent set of relations. Specify all relation headings, indicating primary and foreign keys.
3. Implement at least three related entities from your database in SQL, MongoDB and Neo4j (these may be a different three entities for each technology) and generate sample data for all of the implementations. Show samples of the generated data.
4. Explain why you chose the particular entities to implement with each technology.
5. Come up with 5 use cases for each database technology and implement them using queries. Show the query code and the output.
Include the answers to each of the five points above in your technical report.
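As an illustration only of points 2, 3 and 5 on the SQL side, the sketch below uses Python's built-in sqlite3 module. The three entities (customer, product, purchase), their columns and the sample rows are hypothetical and are not one of the given scenarios; your own schema must follow from your ERD.

```python
import sqlite3


def build_demo_db():
    """Create three related tables in an in-memory database and add sample rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            title      TEXT NOT NULL,
            price      REAL NOT NULL
        );
        -- purchase links the other two entities through foreign keys.
        CREATE TABLE purchase (
            purchase_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            product_id  INTEGER NOT NULL REFERENCES product(product_id),
            quantity    INTEGER NOT NULL
        );
        """
    )
    # Hypothetical sample data, small enough to check by hand.
    conn.executemany("INSERT INTO customer VALUES (?, ?)",
                     [(1, "Ada"), (2, "Grace")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(10, "Keyboard", 30.0), (11, "Mouse", 12.5)])
    conn.executemany("INSERT INTO purchase VALUES (?, ?, ?, ?)",
                     [(100, 1, 10, 1), (101, 1, 11, 2), (102, 2, 11, 1)])
    conn.commit()
    return conn


def spend_per_customer(conn):
    """Example use case: total amount spent by each customer, ordered by name."""
    return conn.execute(
        """
        SELECT c.name, SUM(p.price * pu.quantity) AS total
        FROM purchase AS pu
        JOIN customer AS c ON c.customer_id = pu.customer_id
        JOIN product  AS p ON p.product_id  = pu.product_id
        GROUP BY c.customer_id
        ORDER BY c.name
        """
    ).fetchall()


conn = build_demo_db()
for name, total in spend_per_customer(conn):
    print(name, total)
```

A report would show the query text together with output of this kind for each of the five use cases, and repeat the exercise with the equivalent MongoDB and Neo4j queries.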
B. In an era of growing data complexity and volume, feature selection and construction techniques play a key role in understanding data: they help reduce dimensionality and improve learnability in data analytics problems. In both data and big data processing and analytics, feature selection techniques are important for reducing the time required to build machine learning models and for improving the performance of these models. Moreover, principal component analysis is an important algorithm in data and big data processing, used for data visualisation, for dimensionality reduction, and for gaining insight into the knowledge hidden in the data.
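To make the link between principal components and variance concrete, the following is a minimal sketch for the special case of two features, where the eigenvalues of the covariance matrix have a closed form. The function names and the toy data are assumptions made for illustration; for a real dataset with many features you would normally rely on a library implementation of PCA.

```python
import math


def explained_variance_ratio_2d(xs, ys):
    """Share of the total variance captured by PC1 and PC2 for two features."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long feature columns with at least 2 rows")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance matrix [[a, b], [b, c]] of the centred data.
    a = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    c = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    total = a + c  # trace of the matrix = sum of its eigenvalues
    if total == 0:
        raise ValueError("both features are constant, so there is no variance")
    # Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
    centre = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [centre + radius, centre - radius]
    return [ev / total for ev in eigenvalues]


def cumulative(ratios):
    """Running total: variance covered by the first k components."""
    covered, running = [], 0.0
    for ratio in ratios:
        running += ratio
        covered.append(running)
    return covered


# Made-up, nearly collinear toy data (not one of the assignment datasets).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
ratios = explained_variance_ratio_2d(xs, ys)
print(cumulative(ratios))
```

Because the two toy features are almost perfectly correlated, the first component alone covers nearly all of the variance. The cumulative list is the quantity to plot against the number of components when discussing how much variance is covered.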
For the submission, you are given three datasets. Select one dataset, define a classification problem of your choice (you can group classes if you think it is appropriate), and perform the following tasks:
1. Define the training and testing set for your dataset.
2. Apply two classification algorithms of your choice and compare their performance on the dataset of your choice.
3. Apply Principal Component Analysis to the dataset and explain its outcome. How does the number of principal components affect the percentage of variance covered for this dataset?
4. Apply a feature selection algorithm of your choice and compare the performance of one algorithm of your choice (it can be one of the two you used in question 2) before and after applying the feature selection algorithm.
Faculty of Science and Technology - Department of Computing and Informatics
Unit Title: Data Processing and Analytics
Assessment Title: Project report and video on data processing and analytics topics
Unit Level: 7
Assessment Number: 1 of 1
Credit Value of Unit: 20
Date Issued: 13/01/2020
Marker(s): Theodoros Kostoulas and Rashid Bakirov
Quality Assessor: Hamid Bouchachia
Submission Location: Turnitin
Feedback method: Turnitin
This is a group assignment with individual elements which carries 100% of the final unit mark
January 2020 v1
5. Discuss the challenges and implications regarding the time required to build the required
models. Compare those times when using feature selection algorithms and when no such
algorithms are used.
You must use a significance test on the test set to evaluate the results. Use the AUC (area under
the curve) as the metric for comparing performance. Include the answers to each of the five points
above in your technical report.
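One possible way to combine the AUC metric with a significance test is sketched below. The brief does not prescribe a particular test; the paired permutation test shown here is one option among several, the function names are hypothetical, and the approach assumes both models output scores on a comparable scale (for example, predicted probabilities).

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def paired_permutation_p_value(labels, scores_a, scores_b, n_rounds=2000, seed=0):
    """Two-sided p-value for the AUC difference of two models on one test set.

    Under the null hypothesis the two models are interchangeable, so their
    scores for each test example can be swapped at random.
    """
    rng = random.Random(seed)
    observed = abs(auc(labels, scores_a) - auc(labels, scores_b))
    extreme = 0
    for _ in range(n_rounds):
        perm_a, perm_b = [], []
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        if abs(auc(labels, perm_a) - auc(labels, perm_b)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_rounds + 1)


# Tiny made-up example: four test examples and the scores of one model.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
```

The same pair of functions can be applied to one classifier before and after feature selection. Wrapping each model-building call with `time.perf_counter()` gives the build times needed for point 5.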
Groups
• It is your responsibility to decide whom you would like to work with. You also need to be prepared
for any unforeseen circumstances that may arise when working as a team. You should inform the
unit leader via email about your group members within three weeks of the issue date of this
coursework. Otherwise, the remaining students will be randomly assigned into groups.
• Each group should consist of four members. If you would like to form a group of a different size
(minimum: 3, maximum: 5), you can do this in exceptional circumstances only, and you must
explain these circumstances in your report. In that case, you will still be marked as if you worked
in a group of four. That is, there will be no compensation considered during marking.
• We assume that all members in the group will receive equal marks for the technical report (80%)
of the assignment. If group members do not agree with this default mark distribution and raise an
issue (e.g. a member did not engage or made a poor contribution), the unit leader will contact the
group members via email to resolve the issue and marks can be reallocated.
SUBMISSION FORMAT
There will be two deliverables for this assignment:
1. Technical report (80%, equivalent of 6000 words) - Group submission. The technical report needs
to address both tasks A and B.
2. Individual video (20%, equivalent of 600 words) – Individual submission. Prepare a five-minute
video, where you discuss your role in the group, your contribution to the final submission and the
steps that you followed for completing the tasks. The links to the video of every group member
should be included in the project report.
MARKING CRITERIA
The following criteria will be used to assess the assignment:
Technical report:
Task A: 40%
Each subtask is listed with its HIGH MARK, MEDIUM MARK and LOW MARK descriptors.
1. Entity Relationship Model (10%)
   HIGH MARK: All or most of the required entities, attributes, relations and cardinalities (including the optional participation) are correctly identified and described.
   MEDIUM MARK: Many of the required entities, attributes, relations and cardinalities (including the optional participation) are correctly identified and described. ERD generally makes sense.
   LOW MARK: Few of the required entities, attributes, relations and cardinalities (including the optional participation) are correctly identified and described. ERD does not address the scenario.
2. Relational Schema (5%)
   HIGH MARK: ERD is mostly correctly translated into relational schema, including the specification of primary and foreign keys.
   MEDIUM MARK: ERD is translated into relational schema, with some mistakes and some missing keys.
   LOW MARK: ERD is translated into relational schema, with many mistakes and missing keys.
3. Databases and sample data (10%)
   HIGH MARK: Designed databases mostly correspond to the scenario and the proposed ERD. All of the entities have sufficient sample data, and it is clearly shown.
   MEDIUM MARK: Designed databases generally correspond to the scenario and the proposed ERD. All of the entities have sample data with some omissions and ambiguities.
   LOW MARK: Designed databases mostly do not correspond to the scenario and the proposed ERD. There are significant omissions in sample data, some or all of the entities do not have any sample data.
4. Design choices explanation (5%)
   HIGH MARK: Explanation of the choice of entities for database technologies is clear and fits the characteristics of these technologies.
   MEDIUM MARK: Explanation of the choice of entities for database technologies is generally clear and the characteristics of these technologies are considered.
   LOW MARK: Explanation of the choice of entities for database technologies is unclear and contradicts the characteristics of these technologies.
5. Use cases and queries (10%)
   HIGH MARK: Use cases are clear and the queries address them well. Queries are correct and deliver valid results, which are clearly shown.
   MEDIUM MARK: Use cases are generally clear and the queries address them appropriately for most of the cases. Many queries are correct and deliver some results, which are shown.
   LOW MARK: Use cases are unclear and the existing queries do not address them appropriately for most of the cases. Queries are incorrect/missing and the results are not shown well.
Task B: 40%
Each subtask is listed with its HIGH MARK, MEDIUM MARK and LOW MARK descriptors.
Introduction (5%)
   HIGH MARK: Detailed and concise to the problem.
   MEDIUM MARK: Generally clear but not to the point.
   LOW MARK: Not relevant or inadequate.
System description (10%)
   HIGH MARK: Detailed and elaborated.
   MEDIUM MARK: Vague and not described in detail.
   LOW MARK: Not detailed, with mistakes or parts missing.
Experimental setup (10%)
   HIGH MARK: Detailed and elaborated.
   MEDIUM MARK: Vague and not described in detail.
   LOW MARK: Not detailed, with mistakes or parts missing.
Experimental results (5%)
   HIGH MARK: Detailed and elaborated.
   MEDIUM MARK: Vague and not described in detail.
   LOW MARK: Not detailed, with mistakes or parts missing.
Discussion on the results (10%)
   HIGH MARK: Thorough elaboration on the results with meaningful explanation.
   MEDIUM MARK: Vague and not discussed in detail.
   LOW MARK: Not discussed or analysed enough.
Individual video 20%:
Each subtask is listed with its HIGH MARK, MEDIUM MARK and LOW MARK descriptors.
Clarity of explanation (10%)
   HIGH MARK: Detailed and concise to the problem.
   MEDIUM MARK: Generally clear but not to the point.
   LOW MARK: Not relevant or inadequate.
Connection to the tech report (5%)
   HIGH MARK: Detailed and elaborated.
   MEDIUM MARK: Vague and not described in detail.
   LOW MARK: Not detailed, with mistakes or parts missing.
Summary and conclusions (5%)
   HIGH MARK: Thorough elaboration on the results with meaningful explanation.
   MEDIUM MARK: Vague and not discussed in detail.
   LOW MARK: Not discussed or analysed enough.
LEARNING OUTCOMES
This assignment tests your ability to:
1. Perform and critically analyse data modelling.
2. Understand the underlying technology of various database systems.
3. Gain critical understanding of Data analytics’ challenges.
4. Gain critical understanding of the most significant pattern recognition algorithms for dealing with Data
and Big Data.
5. Be able to interpret the results from Data and Big Data analytics’ algorithms and use the appropriate
methods for reporting the results.
QUESTIONS ABOUT THE BRIEF
Any issues about the assignment can be raised with the lecturer during lectures/seminars or by
appointment. Email will be used for handling questions about the brief when no seminar/lab session is
scheduled between the time the questions arise and the submission deadline.
Signature Marker Rashid Bakirov and Theodoros Kostoulas
HELP AND SUPPORT
• If a piece of coursework is not submitted by the required deadline, the following will apply:
1. If coursework is submitted within 72 hours after the deadline, the maximum mark that can be
awarded is 50%. If the assessment achieves a pass mark and subject to the overall performance
of the unit and the student’s profile for the level, it will be accepted by the Assessment Board as
the reassessment piece. The unit will count towards the reassessment allowance for the level;
This ruling will apply to written coursework and artefacts only; This ruling will apply to the first
attempt only (including any subsequent attempt taken as a first attempt due to exceptional
circumstances).
2. If a first attempt coursework is submitted more than 72 hours after the deadline, a mark of zero
(0%) will be awarded.
3. Failure to submit/complete any other types of coursework (which includes resubmission
coursework without exceptional circumstances) by the required deadline will result in a mark of
zero (0%) being awarded.
The Standard Assessment Regulations can be found on Brightspace.
• If you have any valid exceptional circumstances which mean that you cannot meet an assignment
submission deadline and you wish to request an extension, you will need to complete and submit the
Exceptional Circumstances Form for consideration to your Programme Support Officer (based in
C114) together with appropriate supporting evidence (e.g. GP note), normally before the
coursework deadline. Further details on the procedure and the exceptional circumstances form can
be found on Brightspace. Please make sure that you read these documents carefully before
submitting anything for consideration. For further guidance on exceptional circumstances please see
your Programme Leader.
• You must acknowledge your source every time you refer to others’ work, using the BU Harvard
Referencing system (Author Date Method). Failure to do so amounts to plagiarism which is against
University regulations. Please refer to http://libguides.bournemouth.ac.uk/bu-referencing-harvardstyle
for the University’s guide to citation in the Harvard style. Also be aware of self-plagiarism, which
primarily occurs when a student submits a piece of work to fulfil the assessment requirement for a
particular unit and all or part of the content has been previously submitted by that student for formal
assessment on the same/a different unit. Further information on academic offences can be found on
Brightspace and from https://www1.bournemouth.ac.uk/discover/library/using-library/howguides/
how-avoid-academic-offences
• Students with Additional Learning Needs may contact Learning Support on
www.bournemouth.ac.uk/als
Disclaimer: The information provided in this assignment brief is correct at time of publication. In the
unlikely event that any changes are deemed necessary, they will be communicated clearly via e-mail and
Brightspace and a new version of this assignment brief will be circulated.
