Before an analyst begins collecting data, they must answer three questions:
• What’s the goal or purpose of this research/project?
• What kinds of data are they planning on gathering?
• What methods and procedures will be used to collect, store, and process the information?
Additionally, we can break up data into qualitative and quantitative types.
• Qualitative data covers descriptions such as color, size, quality, and appearance.
• Quantitative data, unsurprisingly, deals with numbers, such as statistics, poll numbers, percentages, etc.
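To make the distinction concrete, here is a minimal sketch in Python that labels each field of a small dataset as qualitative or quantitative. The records and field names (`color`, `quality`, `price`, `units_sold`) are made up for illustration, and the rule "numeric means quantitative" is a simplifying assumption that real datasets will sometimes break (for example, numeric ID codes).

```python
# Hypothetical survey records; the field names and values are illustrative only.
records = [
    {"color": "red", "quality": "good", "price": 12.5, "units_sold": 40},
    {"color": "blue", "quality": "fair", "price": 9.0, "units_sold": 65},
    {"color": "red", "quality": "poor", "price": 7.25, "units_sold": 12},
]


def classify_fields(rows):
    """Label a field 'quantitative' if every value is a number, else 'qualitative'."""
    kinds = {}
    for field in rows[0]:
        values = [row[field] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        kinds[field] = "quantitative" if numeric else "qualitative"
    return kinds


field_kinds = classify_fields(records)
print(field_kinds)
```

Descriptive fields such as color and quality come out as qualitative, while price and units sold come out as quantitative.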
Data Collection Methods
• The two methods are:
• Primary
• As the name implies, this is original, first-hand data collected by the data researchers. This process is the initial information gathering step, performed before anyone carries out any further or related research.
• Primary data tends to be highly accurate because the researcher collects the information directly. The downside is that first-hand research can be time-consuming and expensive.
There are several methods for collecting primary data:
- Interviews
- Projective Technique
- Observation
- Focus Groups
- Questionnaires
- Delphi Technique
• Secondary
• It’s second-hand information.
• This data is either information that the researcher has tasked other people to collect or information the researcher has looked up.
• Although it’s easier and cheaper to obtain than primary information, secondary information raises concerns regarding accuracy and authenticity.
Quantitative data makes up a majority of secondary data.
Since the information has already been collected, the researcher consults various data sources, such as:
- Financial Statements
- Sales Reports
- Retailer/Distributor/Dealer Feedback
- Customer Personal Information (e.g., name, address, age, contact info)
- Business Journals
- Government Records (e.g., census, tax records, Social Security info)
- Trade/Business Magazines
- The internet
What is Data Pre-processing?
• Data pre-processing is the step in which data is transformed, or encoded, into a state that a machine can easily parse. In other words, the algorithm can then easily interpret the features of the data.
• Data pre-processing is a data mining technique that transforms raw data into an understandable format. Real-world data is often incomplete, inconsistent, lacking in certain behaviours or trends, and likely to contain many errors. Data pre-processing is a proven method of resolving such issues.
• Real-world data are generally:
• Incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data.
• Noisy: containing errors or outliers.
• Inconsistent: containing discrepancies in codes or names.
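The three problems listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records, the `COUNTRY_ALIASES` mapping, the plausible age range of 0 to 120, and the choice to fill gaps with the median are all hypothetical, and a real project would pick its own rules.

```python
from statistics import median

# Hypothetical raw records; values are illustrative only.
raw = [
    {"country": "USA", "age": 34},
    {"country": "U.S.A.", "age": None},  # incomplete: missing attribute value
    {"country": "usa", "age": 29},       # inconsistent: same country, different spelling
    {"country": "India", "age": 31},
    {"country": "india", "age": 420},    # noisy: implausible outlier
]

# Assumed mapping from known spelling variants to one canonical name.
COUNTRY_ALIASES = {"usa": "USA", "u.s.a.": "USA", "india": "India"}
# Assumed plausible range for a person's age.
AGE_RANGE = (0, 120)


def clean(rows):
    cleaned = []
    for row in rows:
        # Inconsistent: normalise spelling variants to a canonical name.
        key = row["country"].strip().lower()
        country = COUNTRY_ALIASES.get(key, row["country"])
        # Noisy: treat out-of-range ages as missing.
        age = row["age"]
        if age is not None and not (AGE_RANGE[0] <= age <= AGE_RANGE[1]):
            age = None
        cleaned.append({"country": country, "age": age})

    # Incomplete: fill missing ages with the median of the valid ones.
    valid_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fill_value = median(valid_ages)
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill_value
    return cleaned


cleaned = clean(raw)
for row in cleaned:
    print(row)
```

Filling with the median is just one option; dropping incomplete rows or flagging them for manual review may suit other datasets better.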
• When we talk about data, we usually think of some large datasets with a huge number of rows and columns. While that is a likely scenario, it is not always the case.
Data can come in many different forms: structured tables, images, audio files, videos, etc.
• Machines don’t understand free text, image, or video data as it is; they understand 1s and 0s.
• So it probably won’t be good enough if we put on a slideshow of all our images and expect our machine learning model to get trained just by that!
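As a small illustration of turning non-numeric data into something a model can consume, the sketch below one-hot encodes a list of text labels using plain Python. One-hot encoding is only one of several possible encodings, and the helper name `one_hot_encode` and the sample colors are made up for this example.

```python
def one_hot_encode(values):
    """Map each distinct label to a vector with a single 1 in its own position."""
    categories = sorted(set(values))  # fixed, reproducible column order
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        vector = [0] * len(categories)
        vector[index[value]] = 1
        vectors.append(vector)
    return categories, vectors


# Hypothetical qualitative feature.
colors = ["red", "blue", "green", "blue"]
categories, encoded = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
for color, vector in zip(colors, encoded):
    print(color, vector)
```

Each text label becomes a row of 0s and 1s, which is the kind of numeric input most machine learning algorithms expect.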