Exam Details

  • Exam Code: E20-007
  • Exam Name: Data Science and Big Data Analytics
  • Certification: EMC Certifications
  • Vendor: EMC
  • Total Questions: 198 Q&As
  • Last Updated: Apr 07, 2025

EMC EMC Certifications E20-007 Questions & Answers

  • Question 121:

    Data visualization is used in the final presentation of an analytics project. For what else is this technique commonly used?

    A. Assessing data quality

    B. Descriptive statistics

    C. ETLT

    D. Model selection

  • Question 122:

    Consider a database with 4 transactions:

    Transaction 1: {cheese, bread, milk}

    Transaction 2: {soda, bread, milk}

    Transaction 3: {cheese, bread}

    Transaction 4: {cheese, soda, juice}

    You decide to run the association rules algorithm where minimum support is 50%. Which rule has a confidence of at least 50%?

    A. {cheese} => {bread}

    B. {juice} => {cheese}

    C. {milk} => {soda}

    D. {soda} => {milk}
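The support and confidence of each candidate rule in Question 122 can be checked by hand or with a short script. The following is a minimal sketch in Python using only the four transactions given in the question; the helper names `support` and `confidence` are illustrative and do not come from any library. It assumes the usual definitions: support is the fraction of transactions containing every item in the rule, and confidence is support(LHS and RHS together) divided by support(LHS).

```python
# Transactions copied from Question 122.
transactions = [
    {"cheese", "bread", "milk"},
    {"soda", "bread", "milk"},
    {"cheese", "bread"},
    {"cheese", "soda", "juice"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """support(lhs together with rhs) divided by support(lhs)."""
    return support(set(lhs) | set(rhs)) / support(lhs)

# The four candidate rules, keyed by answer label.
rules = {
    "A": ({"cheese"}, {"bread"}),
    "B": ({"juice"}, {"cheese"}),
    "C": ({"milk"}, {"soda"}),
    "D": ({"soda"}, {"milk"}),
}

MIN_SUPPORT = 0.5
MIN_CONFIDENCE = 0.5

results = {}
for label, (lhs, rhs) in rules.items():
    s = support(lhs | rhs)
    c = confidence(lhs, rhs)
    results[label] = (s, c)
    passes = s >= MIN_SUPPORT and c >= MIN_CONFIDENCE
    print(f"{label}: support={s:.2f} confidence={c:.2f} passes={passes}")
```

Note that a rule must clear the minimum support threshold before its confidence is considered at all, which is why the script reports both numbers for each rule.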

  • Question 123:

    What describes a true limitation of a Logistic Regression method?

    A. Does not handle missing values well

    B. Does not handle redundant variables well

    C. Does not handle correlated variables well

    D. Does not have explanatory values

  • Question 124:

    Your company has 3 different sales teams. Each team's sales manager has developed incentive offers to increase the size of each sales transaction. Any sales manager whose incentive program can be shown to increase the size of the average sales transaction will receive a bonus.

    Data are available for the number and average sale amount for transactions offering one of the incentives as well as transactions offering no incentive.

    The VP of Sales has asked you to determine analytically if any of the incentive programs has resulted in a demonstrable increase in the average sale amount. Which analytical technique would be appropriate in this situation?

    A. One-way ANOVA

    B. Multi-way ANOVA

    C. Student's t-test

    D. Wilcoxon Rank Sum Test
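Question 124 compares average sale amounts across four groups (three incentive programs plus no incentive). As an illustration of how a one-way ANOVA F statistic is computed for such a setup, here is a minimal sketch in plain Python. The sale amounts are made up for illustration only and the function name `one_way_anova_f` is hypothetical. Turning F into a p-value needs the F distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.f_oneway`) would normally be used for that step.

```python
from statistics import mean

# Hypothetical sale amounts per group (invented for illustration only).
groups = {
    "no_incentive": [100.0, 102.0, 98.0, 101.0],
    "team_1": [110.0, 112.0, 108.0, 111.0],
    "team_2": [101.0, 99.0, 103.0, 100.0],
    "team_3": [105.0, 104.0, 107.0, 106.0],
}

def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    k = len(samples)      # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_anova_f(list(groups.values()))
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F indicates that at least one group mean differs from the others; it does not say which one, so a follow-up comparison of each incentive against the no-incentive group would still be needed.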

  • Question 125:

    What is an appropriate data visualization to use in a presentation for an analyst audience?

    A. Pie chart

    B. Area chart

    C. Stacked bar chart

    D. ROC curve

  • Question 126:

    Refer to the exhibit.

    You have run a linear regression model against your data, and have plotted true outcome versus predicted outcome. The R-squared of your model is 0.75. What is your assessment of the model?

    A. The R-squared may be biased upwards by the extreme-valued outcomes. Remove them and refit to get a better idea of the model's quality over typical data.

    B. The R-squared is good. The model should perform well.

    C. The extreme-valued outliers may negatively affect the model's performance. Remove them to see if the R-squared improves over typical data.

    D. The observations seem to come from two different populations, but this model fits them both equally well.
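The exhibit for Question 126 is not reproduced here, but the way a few extreme-valued outcomes can dominate R-squared is easy to demonstrate. The sketch below uses made-up numbers (not the exhibit's data) and a hypothetical helper `r_squared`, assuming the standard definition R² = 1 − SS_res / SS_tot.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Invented data: on the typical points the model only predicts the mean,
# while the two extreme points happen to be predicted exactly.
typical_true = [1.0, 2.0, 3.0, 4.0, 5.0]
typical_pred = [3.0, 3.0, 3.0, 3.0, 3.0]
extreme_true = [50.0, 60.0]
extreme_pred = [50.0, 60.0]

r2_all = r_squared(typical_true + extreme_true, typical_pred + extreme_pred)
r2_typical = r_squared(typical_true, typical_pred)

print(f"R-squared with extreme points:    {r2_all:.3f}")
print(f"R-squared on typical points only: {r2_typical:.3f}")
```

Because the extreme points inflate the total sum of squares, the overall R-squared is close to 1 even though the model explains none of the variation among the typical observations.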

  • Question 127:

    In linear regression, what indicates that an estimated coefficient is significantly different from zero?

    A. A small p-value

    B. R-squared near 1

    C. R-squared near 0

    D. The estimated coefficient is greater than 3

  • Question 128:

    You are using the Apriori algorithm to determine the likelihood that a person who owns a home has a good credit score. You have determined that the confidence for the rules used in the algorithm is > 75%. You calculate lift = 1.011 for the rule, "People with good credit are homeowners". What can you determine from the lift calculation?

    A. Support for the association is low

    B. Leverage of the rules is low

    C. The rule is coincidental

    D. The rule is true
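Question 128 turns on how lift is read. Under the usual definitions, lift(X => Y) = support(X and Y) / (support(X) × support(Y)) and leverage(X => Y) = support(X and Y) − support(X) × support(Y). A lift near 1 (equivalently, a leverage near 0) means X and Y co-occur about as often as they would if they were independent. The sketch below illustrates this with hypothetical support values; the numbers and function names are assumptions for illustration and are not taken from the question.

```python
def lift(support_xy, support_x, support_y):
    """Ratio of observed co-occurrence to that expected under independence."""
    return support_xy / (support_x * support_y)

def leverage(support_xy, support_x, support_y):
    """Difference between observed and expected co-occurrence."""
    return support_xy - support_x * support_y

# Hypothetical supports: X and Y are exactly independent here,
# because 0.5 * 0.8 equals the joint support of 0.4.
independent = (0.4, 0.5, 0.8)
# Hypothetical supports where X and Y co-occur more than expected.
associated = (0.3, 0.5, 0.4)

for name, (s_xy, s_x, s_y) in [("independent", independent),
                               ("associated", associated)]:
    print(f"{name}: lift={lift(s_xy, s_x, s_y):.3f} "
          f"leverage={leverage(s_xy, s_x, s_y):.3f}")
```

High confidence alone does not establish a useful rule: if the consequent is already very common, confidence is high even when the antecedent carries no information, and lift stays close to 1.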

  • Question 129:

    A data scientist is asked to implement an article recommendation feature for an online magazine. The magazine does not want to use client tracking technologies such as cookies or reading history. Therefore, only the style and subject matter of the current article are available for making recommendations. All of the magazine's articles are stored in a database in a format suitable for analytics.

    Which method should the data scientist try first?

    A. K Means Clustering

    B. Naive Bayesian

    C. Logistic Regression

    D. Association Rules

  • Question 130:

    What is an appropriate data visualization to use in a presentation for a project sponsor?

    A. Bar chart

    B. Pie chart

    C. Box and Whisker plot

    D. Density plot
