How to choose the data to replace the missing values in the dataset - data-science

I would like to know how to approach data imputation. I have missing values in my dataset. Which method should we use to replace the NA values, and on what basis do we choose it?

I can't tell you exactly what to do without knowing the data you are actually dealing with... But I assume the wildcard package in R would be very helpful.
Please take a look at this official document if you are using R to manipulate data.
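To make the "on what basis" part concrete, here is a minimal sketch in Python rather than R. It assumes one common baseline that is not taken from the answer above: fill numeric columns with the median (less sensitive to outliers than the mean) and non-numeric columns with the most frequent value. The function name impute_column and the sample data are made up for illustration.

```python
from statistics import median, mode

def impute_column(values):
    """Fill None entries: median for numeric columns, most frequent value otherwise."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to learn a fill value from
    # Treat the column as numeric only if every observed value is a number.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    fill = median(present) if numeric else mode(present)
    return [fill if v is None else v for v in values]

# Hypothetical sample columns with missing entries.
ages = [23, None, 31, 40, None]
cities = ["Oslo", None, "Oslo", "Bergen"]
print(impute_column(ages))
print(impute_column(cities))
```

Whether a simple fill like this is acceptable depends on how much data is missing and why it is missing, which is why the answer asks about the data first.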

Related

How to view stats on Snowflake?

I am looking for a way to visualize the stats of a table in Snowflake.
The long way around is to pull a meaningful sample of the data with Python and profile it with Pandas, but pulling the data out of Snowflake is somewhat inefficient and unsafe.
Snowflake's new interface shows these stats graphically, and I would like to know whether there is a way to obtain this data with a query or by consulting metadata.
I need something like Pandas-profiling but without an external server. Maybe Snowflake stores metadata/statistics about its columns (numeric, categorical)?
https://github.com/pandas-profiling/pandas-profiling
Thank you for your advice.
You can find a lot of meta information in the INFORMATION_SCHEMA.
All the views and table functions in the Snowflake INFORMATION_SCHEMA can be found here: https://docs.snowflake.com/en/sql-reference/info-schema.html
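If the goal is to keep the computation inside Snowflake, one option is to read the column names and types from INFORMATION_SCHEMA.COLUMNS and generate a single aggregate query from them. The sketch below only builds the SQL text; it does not connect to anything. The function name profile_query, the table name, and the set of type names treated as numeric are assumptions for illustration, and identifiers are not quoted or escaped.

```python
def profile_query(table, columns):
    """Build one SQL statement that computes basic per-column statistics.

    `columns` maps column name -> data type string, for example as read
    from INFORMATION_SCHEMA.COLUMNS (COLUMN_NAME, DATA_TYPE).
    """
    numeric_types = {"NUMBER", "FLOAT"}  # assumed type names; adjust as needed
    parts = ["COUNT(*) AS row_count"]
    for name, dtype in columns.items():
        # Non-null and distinct counts make sense for every column type.
        parts.append(f"COUNT({name}) AS {name}_non_null")
        parts.append(f"COUNT(DISTINCT {name}) AS {name}_distinct")
        if dtype.upper() in numeric_types:
            # Range and mean only for numeric columns.
            parts.append(f"MIN({name}) AS {name}_min")
            parts.append(f"MAX({name}) AS {name}_max")
            parts.append(f"AVG({name}) AS {name}_avg")
    return "SELECT\n  " + ",\n  ".join(parts) + f"\nFROM {table}"

# Hypothetical table and columns.
print(profile_query("my_table", {"amount": "NUMBER", "city": "TEXT"}))
```

The generated statement uses only standard aggregates, so the data never leaves the warehouse; only the one-row summary comes back.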
Not sure if you're talking about viewing the information schema as mentioned, but if you need documentation on this whole new interface, it's called Snowsight.
You can learn more here:
https://docs.snowflake.com/en/user-guide/ui-snowsight.html
cheers!
The highlight in your screenshot isn't statistics about the data in the table, but merely about the query result (which looks like a DESCRIBE TABLE query). For example, if you look at type, it simply tells you that this table has 6 VARCHAR columns, 2 timestamps, and 1 number.
What you're looking for is something that is provided by most BI tools or data catalogs. I suggest you take a look at those instead.
You could also use an independent tool, like Soda, which is open source.

Convert table data into flat format using Excel VBA

I have the following input data range
and the following desired output
The first block with Compals will always be the same, so 5 rows. The Base Unit Current and Base Unit Later blocks will vary: sometimes eight options and sometimes more than eight.
I'm very new to Excel VBA, so unfortunately I have no idea how to do that. Could anyone please help me or give me some advice? Many thanks in advance.
I wonder whether you would accept this kind of answer or not.
Instead of a table, I think this kind of pivot may suit your needs:
I reshaped the table
Create the pivot table

KeywordFilter field to filter from database values

How can I implement a KeywordFilterField to filter data from a database table as soon as text is entered into the field?
Most of the samples I have come across demonstrate filtering from predefined arrays. What I am looking for is filtering from a database.
Please guide me on how to go about it. Thanks.
I have tried this example from the BB docs, which filters from arrays.
A straightforward approach would be to load the data from the database into a ReadableList and pass that to the KeywordFilterField.
The method used to set the values is
setSourceList(ReadableList list, KeywordProvider helper)
ReadableList is an interface which has a few implementations. The example code you are looking at uses the SortedReadableList but a BasicFilteredList would work nicely too.

How to get multi-row data of one column into one row of one column

I need to combine data from multiple rows of one column into a single row.
For example, from this format
ID Interest
Sports
Cooking
Movie
Reading
to this format
ID Interest
Sports,Cooking
Movie,Reading
I wonder whether we can do that in MS Access SQL. If anybody knows how, please help me.
Take a look at Allen Browne's approach: Concatenate values from related records
As for the normalization argument, I'm not suggesting you store concatenated values. But if you want to join them together for display purposes (like a report or form), I don't think you're violating the rules of normalization.
This is called de-normalizing data. It may be acceptable for final reporting. Apparently some experts believe it's good for something, as seen here.
(Mind you, kevchadder's question is right on.)
Have you looked into the SQL Pivot operation?
Take a look at this link:
http://technet.microsoft.com/en-us/library/ms177410.aspx
Just noticed you're using Access. Take a look at this article:
http://www.blueclaw-db.com/accessquerysql/pivot_query.htm
This is not something you should do in SQL, and it's most likely not possible at all.
Merging the rows in your application code shouldn't be too hard.
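As a rough illustration of the application-code route, here is a minimal Python sketch that groups rows by ID and joins the interests with commas. The ID values 1 and 2 are hypothetical, since the IDs in the question's sample did not survive, and concat_by_id is a made-up helper name.

```python
from collections import defaultdict

def concat_by_id(rows, sep=","):
    """Group (id, interest) pairs by id and join each group's interests."""
    grouped = defaultdict(list)
    for row_id, interest in rows:
        grouped[row_id].append(interest)
    # Dicts preserve insertion order, so ids appear in first-seen order.
    return {row_id: sep.join(items) for row_id, items in grouped.items()}

# Hypothetical rows as they might come back from a query on the table.
rows = [(1, "Sports"), (1, "Cooking"), (2, "Movie"), (2, "Reading")]
print(concat_by_id(rows))
```

This keeps the stored data normalized and only concatenates at display time, which matches the point made in the earlier answer.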

Question on how to use SQL Server Integration Services

I have a table called book whose attributes are booked_id, yearmon, and day_01 ... day_31. I need to unpivot the table and transform day_01 ... day_31 into rows. I have succeeded in doing that, but the problem is that my yearmon is in the format 200805, and I need to append a day to it based on day_01, day_02, etc., so that I can create a new column with date information. For example, for day_01 it would look like 20080501. Instead of writing a huge query, does anyone know how to use SSIS to transform it?
You should be able to use the Unpivot component and the Derived Column component to do what you need. Look into those and post back if they don't seem to do what you need.
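To show the logic those two components would carry out, here is a small Python sketch, not SSIS configuration. It assumes each source row is available as a dict keyed by the column names from the question; unpivot_days is a made-up name. The day suffix is taken from the column name and appended to yearmon, which is the step the Derived Column would perform after the unpivot.

```python
def unpivot_days(row):
    """Return (booked_id, date_string, value) for each day_NN column in the row."""
    out = []
    for key, value in row.items():
        if key.startswith("day_"):
            day = key[len("day_"):]  # "01" .. "31", taken from the column name
            # yearmon such as 200805 plus "01" gives "20080501".
            out.append((row["booked_id"], f"{row['yearmon']}{day}", value))
    return out

# Hypothetical row with only two of the 31 day columns shown.
row = {"booked_id": 7, "yearmon": 200805, "day_01": "a", "day_02": "b"}
print(unpivot_days(row))
```

Note that this sketch does not drop impossible dates such as day_31 in a 30-day month; a real pipeline would need a validity check after building the string.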