Tuesday, 26 January 2021

ASDM Assignment: Data Mining using


 Learning Outcomes of this Assessment  

A1 - Critically assess diverse issues regarding the use of data mining and machine learning in real-world contexts, including ethics;  

B1 - Design, build and use business intelligence systems, justifying decisions made;  

B2 - Design and create reports to present analytical and interpretative information in creative and effective ways;  

B3 - Devise strategies for making effective use of analytical software such as SAS Enterprise Miner;

B4 - Learn about different algorithms, such as classification and clustering. 

 

 

Key Skills to be Assessed 

  • Diverse issues regarding the application of data mining techniques to real-world datasets

  • Discover patterns within a dataset using exploratory data analysis 

  • Use of SAS for data mining 

  • Discover techniques to leverage R’s features, and work with packages  

  • Reporting and presentation of analytical and interpretative information 

 

The Assessment Task 

 

Overview 

This coursework gives you the opportunity to use the techniques covered in this module to organise and analyse a collection of data that interests you, to draw conclusions from your analysis, and to present your results in the form of a report.

 

 

Explanation 

 

You must apply Classification, Association Rules Mining, Clustering and Text Mining as your approaches. This means that you will need to choose a dataset that is amenable to each of these types of data mining -- i.e., to building a model that will determine, predict, or estimate one of the attributes in the dataset, based on the values of other attributes.

 

The overall requirement is summarised below:

 

Task 1: Apply classification on a selected dataset using R & SAS Enterprise Miner (e.g., Decision Tree, K-Nearest Neighbours, Logistic Regression). (25 Marks) 

 

Task 2: Apply Association Rules Mining on a selected dataset using R & SAS Enterprise Miner. (25 Marks)

 

Task 3: Apply K-Means Clustering on a selected dataset using R & SAS Enterprise Miner. (20 Marks)

 

Task 4: Apply Sentiment Analysis to 20 selected hotels in the hotel_reviews.csv dataset using R & SAS Enterprise Miner (the dataset can be downloaded from Blackboard). (20 Marks)

 

 

Extra feature to be implemented (10 Marks) 

 


Key elements of the report 

 

Title 

The title should provide an overview of the focus of your problem and the expected solution. 

 

Introduction 

This section contains a brief background to the topic and leads to the formulation of the specific question, based on your selected topic. The research question must be focused and clear. 

 

Datasets 

You are welcome to choose any datasets that interest you and that have enough data to enable meaningful analysis. In making your choice, you should be sure to consider what problem or problems you would be able to solve by employing data mining on the datasets. In other words, you should ask yourself: How could I use data mining to answer one or more questions about the datasets?

 

Explanation and preparation of datasets 

Briefly describe the datasets you have used, including their independent and dependent variables. Explain any preparation tasks (e.g., normalisation, dealing with missing values, etc.) carried out on the datasets.
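The two preparation tasks named above can be illustrated with a minimal sketch. The assignment itself must be carried out in R and SAS Enterprise Miner, so the Python below is only a language-neutral illustration of the arithmetic. The function names and the `ages` column are hypothetical, and mean imputation followed by min-max scaling is just one possible choice, not a prescribed method.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalise(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical numeric column with one missing value.
ages = [20, None, 40, 30]
prepared = min_max_normalise(impute_mean(ages))
```

Whichever preparation steps you choose, the report should state them and explain why they suit the dataset, since distance-based methods such as K-Nearest Neighbours and K-Means are sensitive to the scale of the attributes.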

 

Data mining using SAS Enterprise Miner 

Your explanation must include all of the following: 

The application of data-mining techniques to your chosen datasets using SAS Enterprise Miner.

Train and test your model. 

Use visualisation tools available in SAS Enterprise Miner. 

 

 

Implementation in R 

Implement your proposed approach using package(s) available in R. This section will include:

A brief description of the R package(s) used. 

The application of data-mining techniques to your chosen datasets using R.

Explanation of the experimental procedure, including the setting and optimisation of model parameters during training.  

Visualisation of the results. 

 

 

Results analysis and discussion 

Explain and justify the performance metric you choose to use to evaluate the model(s). 

A clear and compelling presentation of the results that you obtain, both from the data mining and any other analysis that you may perform.  

Compare and discuss the results obtained from the R implementation with the results obtained in SAS Enterprise Miner.
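As an illustration of why the choice of performance metric needs justifying, the sketch below computes accuracy, precision, recall and F1 from a confusion matrix for a binary classifier. It is a language-neutral illustration written in Python, not part of the required R or SAS Enterprise Miner work. The function names and the example labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(actual, predicted, positive):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 3 actual positives among 8 cases.
actual = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "no", "no"]
metrics = classification_metrics(actual, predicted, positive="yes")
```

In this hypothetical example the accuracy is 0.75 while precision and recall are both about 0.67, which shows how accuracy alone can flatter a model when one class dominates. Applying the same metric definitions to both the R and SAS Enterprise Miner results also makes the comparison between them like for like.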

 

Conclusions 

The key points from the assignment must be synthesised within the conclusion. This must relate back to the introduction and the research question and provide an overall evaluation of the validity of the solution you have proposed. 

 

References 

You will list all publications referenced in the report. You should show evidence of sufficient reading related to your work. References must follow the Harvard formatting system as in this guide: http://www.salford.ac.uk/library/help/user-guides/general/Bibliographic-Citations-APA-QuickRef-Apr2015.pdf

 

Appendices 

Appendices may be used to provide relevant supporting evidence for reference but should only be used if necessary. Students may wish to include in the appendices evidence that confirms the originality of their work or illustrates points of principle set out in the main text.

 

Equipment and Facilities to be Used 

The university laboratory computers are installed with SAS and R. 

 

Workload 

This assessment should require approximately 120 hours of effort. 

 

 

 

Marking scheme 

The work will be assessed using a marking grid comprising weighted components (provided below). This is indicative of the standard of work required at different levels within the assignment. 

 

Assessment criteria

Overall levels: Unsatisfactory, below 50 (Fail); Pass, 50-59; Good, 60-69 (Merit); Very Good, 70 and above (Distinction).

Title, Introduction, Conclusion (20%)

  • Unsatisfactory (Fail): Uninformative title, vague introduction, unreliable conclusion.

  • Pass: Satisfactory title, introduction defines the studied problem and the intended tasks well, clear conclusion.

  • Good (Merit): Informative and attractive title, clear setting of the scene and boundaries of the report in the introduction, conclusion drawn persuasively from the results analysis and discussion.

  • Very Good (Distinction): Concise and appealing title, introduction presents the focus of the report with excellent clarity, conclusions are reliable and can be used with confidence.

Explanation of datasets, References (10%)

  • Unsatisfactory (Fail): Insufficient collection of primary information, datasets are barely explained. Inadequate attempt made at proper referencing, many errors/omissions.

  • Pass: Adequate engagement with relevant information collection, reasonable fraction from primary sources. Adequate dataset explanation. Acceptable attempt made at proper referencing, with a number of errors/omissions.

  • Good (Merit): Good information collection, relevant to the assignment, significant fraction from primary sources. Datasets clearly explained. Referencing good, but with some errors and omissions.

  • Very Good (Distinction): Information collection of a very high standard, relevant to the assignment and mostly from primary sources. Concise and informative dataset explanation. Referencing almost perfect.

Data mining using SAS (20%)

  • Unsatisfactory (Fail): Insufficient collection of primary information. No attempt at applying DM techniques using SAS.

  • Pass: DM methods applied to the dataset and results shown, but no explanation of the DM process and/or the results obtained.

  • Good (Merit): DM methods applied to the dataset and results shown, with an explanation of the mining process and of the results obtained.

  • Very Good (Distinction): DM methods applied to the dataset and results shown, with an explanation of the mining process and of the results obtained. A good discussion of the results, comparing R with SAS. Referencing almost perfect.

Implementation in R (30%)

  • Unsatisfactory (Fail): Experimental implementation and setup lack detail, little or no relevant description and discussion of the relevant packages and functions, and no critique of designs.

  • Pass: Basic descriptions of experiments, design, and statistics that could be conducted, relevant literature is lacking, little or no critique.

  • Good (Merit): Good descriptions of experiments, design, and statistics that could be conducted, some relevant literature and basic critique.

  • Very Good (Distinction): Detailed descriptions of experiments, design, and statistics that could be conducted, relevant literature and critique.

Results analysis and discussion (20%)

  • Unsatisfactory (Fail): Results are not presented professionally, little or no results analysis and discussion.

  • Pass: Results are presented using proper means such as tables and graphs, results analysis and discussion are general and shallow.

  • Good (Merit): Results are clearly and informatively presented. Results analysis and discussion are specific and sufficient.

  • Very Good (Distinction): Results are professionally presented to the standard of a journal publication. Results are critically analysed and discussed. Valuable observations and findings are drawn from the results.

 

 

