Exam Details

  • Exam Code: E20-007
  • Exam Name: Data Science and Big Data Analytics
  • Certification: EMC Certifications
  • Vendor: EMC
  • Total Questions: 198 Q&As
  • Last Updated: Apr 07, 2025

EMC EMC Certifications E20-007 Questions & Answers

  • Question 101:

    Assume you are performing an analysis for fraud detection on credit card usage. You need to ensure that higher-risk transactions, which may indicate fraudulent credit card activity, are retained in your data for analysis and not dropped as outliers during pre-processing.

    What is the approach for loading data into the analytical sandbox for this analysis?

    A. ELT

    B. ETL

    C. EDW

    D. OLTP

  • Question 102:

    When is the GROUP BY ROLLUP clause used in an OLAP query?

    A. All subtotals and grand totals are to be included in the output

    B. Subtotals are only to be included in the output

    C. Grand totals are only to be included in the output

    D. Specific subtotals and grand totals for a combination of variables are only to be included in the output
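
    To make the behaviour of the clause in Question 102 concrete, the following is a minimal sketch that emulates GROUP BY ROLLUP(region, product) with SUM(amount) in plain Python. The function name `rollup`, the sample rows, and the use of None as the "all values" placeholder are assumptions made for illustration; a real database marks rolled-up columns with NULL.

```python
from collections import defaultdict


def rollup(rows, keys, value):
    """Emulate GROUP BY ROLLUP(keys...) with SUM(value) (illustrative helper)."""
    totals = defaultdict(float)
    for row in rows:
        # For n keys, ROLLUP aggregates at n + 1 levels: all keys, then the
        # leading n - 1 keys, and so on down to no keys (the grand total).
        for level in range(len(keys), -1, -1):
            group = tuple(row[k] for k in keys[:level])
            group += (None,) * (len(keys) - level)  # None marks a rolled-up column
            totals[group] += row[value]
    return dict(totals)


# Hypothetical sample data
sales = [
    {"region": "East", "product": "A", "amount": 100.0},
    {"region": "East", "product": "B", "amount": 50.0},
    {"region": "West", "product": "A", "amount": 70.0},
]

result = rollup(sales, ["region", "product"], "amount")
for group, total in sorted(result.items(), key=lambda kv: str(kv[0])):
    print(group, total)
```

    The result holds the detail rows, one subtotal per region, and a single grand total. It contains no per-product subtotal across regions, because ROLLUP follows the column order as a hierarchy; producing every combination is what CUBE does.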

  • Question 103:

    Consider a scale that has five (5) values that range from "not important" to "very important". Which data classification best describes this data?

    A. Ordinal

    B. Nominal

    C. Real

    D. Ratio

  • Question 104:

    When is a Naïve Bayesian Classifier model for classification preferred versus a Logistic Regression model?

    A. When using several categorical input variables with over 1000 possible values each

    B. When an estimate of the probability of an outcome is needed, not just which class it is in

    C. When all input variables are numerical

    D. When some of the input variables might be correlated

  • Question 105:

    What is a property of window functions in SQL commands?

    A. They can be used to calculate moving averages over various intervals.

    B. They group rows into a single output row.

    C. They can be used between the keywords FROM and WHERE in a SELECT command.

    D. They don't require ordering of data within a window.
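
    As an illustration of the window functions discussed in Question 105, the sketch below runs a windowed AVG through Python's built-in sqlite3 module. It assumes the bundled SQLite library is version 3.25 or later (the first release with window function support); the table `daily_sales` and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (day INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0), (5, 50.0)],
)

# The OVER clause defines the window: rows ordered by day, covering the
# current row and the two rows before it (a 3-row moving average).
rows = conn.execute(
    """
    SELECT day,
           amount,
           AVG(amount) OVER (
               ORDER BY day
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for day, amount, moving_avg in rows:
    print(day, amount, moving_avg)

conn.close()
```

    The query returns one output row per input row, so the window function does not collapse rows the way GROUP BY does. The window function also appears in the SELECT list, not between FROM and WHERE.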

  • Question 106:

    Based on the exhibit, what is a likely issue with the data?

    A. Saturated data; indicating potential issues with data definitions

    B. Incomplete data; indicating potential issues with data transmission

    C. Mis-scaled data; indicating potential issues with data entry

    D. No obvious concerns with the data are visible

  • Question 107:

    The Marketing department of your company wishes to track opinion on a new product that was recently introduced. Marketing would like to know how many positive and negative reviews are appearing over a given period and potentially retrieve each review for more in-depth insight. They have identified several popular product review blogs that historically have published thousands of user reviews of your company's products.

    You have been asked to provide the desired analysis. You examine the RSS feeds for each blog and determine which fields are relevant. You then craft a regular expression to match your new product's name and extract the relevant text from each matching review.

    What is the next step you should take?

    A. Convert the extracted text into a suitable document representation and index into a review corpus

    B. Use the extracted text and your regular expression to perform a sentiment analysis based on mentions of the new product

    C. Read the extracted text for each review and manually tabulate the results

    D. Group the reviews using Naïve Bayesian classification
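
    The extraction step described in the scenario for Question 107 (matching the product name with a regular expression and pulling the text of each matching review) could look roughly like the sketch below. The product name "AcmePhone", the field names `title` and `description`, and the sample feed items are all hypothetical; the fields of a real RSS feed would need to be inspected first, as the scenario notes.

```python
import re

# Hypothetical product name; tolerate case differences and an optional space.
PRODUCT_PATTERN = re.compile(r"\bacme\s*phone\b", re.IGNORECASE)

# Stand-in for items already parsed from the blogs' RSS feeds.
feed_items = [
    {"title": "AcmePhone review", "description": "Great battery, poor camera."},
    {"title": "Weekend roundup", "description": "I tried the Acme Phone and loved it."},
    {"title": "Laptop deals", "description": "Nothing about phones here."},
]


def extract_matching_reviews(items):
    """Return the review text of every item that mentions the product."""
    matches = []
    for item in items:
        searchable = item["title"] + " " + item["description"]
        if PRODUCT_PATTERN.search(searchable):
            matches.append(item["description"])
    return matches


matching_reviews = extract_matching_reviews(feed_items)
for text in matching_reviews:
    print(text)
```

    This only filters and extracts raw text. The regular expression identifies which reviews mention the product, but it says nothing about whether a review is positive or negative.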

  • Question 108:

    What describes a true property of the Logistic Regression method?

    A. It is robust with redundant variables and correlated variables.

    B. It handles missing values well.

    C. It works well with discrete variables that have many distinct values.

    D. It works well with variables that affect the outcome in a discontinuous way.

  • Question 109:

    Refer to the exhibit.

    The exhibit provides the decision tree for predicting whether someone is a good or bad credit risk. What would be the assigned probability, p(good), of a single male with no known savings?

    A. 0.83

    B. 0

    C. 0.498

    D. 0.6

  • Question 110:

    Assume that you have a data frame in R. Which function would you use to display descriptive statistics about this variable?

    A. summary

    B. str

    C. attributes

    D. levels
