Is it possible to create a DataFrame with partial MultiIndex columns? - pandas

Something like this:
Note that the first and last field are single columns.

No. Once you've got multiple levels in your columns, every column needs a value at each level.
If you're just looking for presentable output, those values can be an empty string. They can also be the same value at each level.
But if you select the first two columns of the first level, you're going to have four columns at the second level, and pandas will need to know what to call all four.
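As a minimal sketch of the empty-string approach described above (the column names `id`, `scores`, `math`, `physics` and `total` are made up for illustration, not taken from the question):

```python
import pandas as pd

# Hypothetical layout: "id" and "total" look like single columns,
# while "scores" has two sub-columns. Every column still carries a
# value at both levels; the "single" ones just use "" at level two.
columns = pd.MultiIndex.from_tuples([
    ("id", ""),
    ("scores", "math"),
    ("scores", "physics"),
    ("total", ""),
])

df = pd.DataFrame(
    [[1, 90, 80, 170], [2, 70, 60, 130]],
    columns=columns,
)

print(df)

# Selecting a first-level label returns its second-level columns.
scores = df["scores"]
print(list(scores.columns))
```

When printed, the empty strings leave the second header row blank under `id` and `total`, which is usually close enough to the "partial" look.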

Related

Forward fill in spark SQL based on column value condition

Can someone please help me forward fill values in a CASE statement based on another column's value in Spark SQL?
I am trying to detect outliers in a SQL dataset. So far I have identified them by flagging values that lie far from the mean of the dataset in terms of standard deviation.
The problem is that wherever an outlier falls, I have to fill a new column with the last valid/authentic value.
For example: after 1 in the first column, I want to put 556 in the third column, and for 3 in the first column, I want to put 561 in the third column.
So far I have identified the outliers, and I am guessing I can use the lag function to go back one row. But I know this is not a good approach: if I get 10 outliers in a sequence, I will have to write 10 CASE statements.
If someone has a better or more efficient approach, please help.

Pandas column classification

I am working on a Fitbit dataset that has a user Id column covering 35 users, plus activity columns recorded on different dates. I need to classify all the other columns with respect to the user Id column in order to perform my analysis. Can anyone help with this?
What you're looking for is the function groupby(). You can group by one or more columns, and then perform aggregations per group on the rest of the columns:
df.groupby(['id', 'date']).agg([...])
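A small sketch of what that could look like. The column names (`steps`, `calories`) and the values are invented for illustration, since the question does not show the actual dataset:

```python
import pandas as pd

# Hypothetical activity data: two users, two days each.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "date": ["2016-04-12", "2016-04-13", "2016-04-12", "2016-04-13"],
    "steps": [1000, 2000, 3000, 5000],
    "calories": [100.0, 200.0, 300.0, 500.0],
})

# One row per user, with a named aggregation for each output column.
per_user = df.groupby("id").agg(
    total_steps=("steps", "sum"),
    mean_calories=("calories", "mean"),
)

print(per_user)
```

Grouping by `['id', 'date']` instead would give one row per user per day.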

Keyerror on pd.merge()

I am trying to merge 2 data-frames ('credit' and 'info') on the column 'id'.
My code for this is:
c.execute('SELECT * FROM "credit"')
credit=c.fetchall()
credit=pd.DataFrame(credit)
c.execute('SELECT * FROM "info"')
info=c.fetchall()
movies_df=pd.DataFrame(info)
movies_df_merge=pd.merge(credit, movies_df, on='id')
The id columns in both tables ('credit' and 'info') are integers, but I am unsure why I keep getting a KeyError on 'id'.
I have also tried:
movies_df_merge=movies_df.merge(credit, on='id')
Start by looking at the DataFrames themselves rather than at the code that loads them.
Just print both DataFrames (if the number of records is big, it will be enough to print(df.head())).
Then look at them. In particular, check whether both DataFrames contain an id column. Maybe one of them is ID, whereas the other is id? Upper and lower case in column names does matter here.
Also check that the id column in both DataFrames is a "normal" column (not part of the index).
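The check above can be sketched like this. The in-memory sqlite3 database and the `actor` and `title` columns are assumptions made purely so the example runs; the question does not say which database or schema is used. With a cursor that returns plain tuples, as sqlite3 does, the resulting DataFrame gets integer column labels, so there is no 'id' column to merge on:

```python
import sqlite3
import pandas as pd

# Hypothetical stand-in for the real database.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE credit (id INTEGER, actor TEXT)")
c.execute("CREATE TABLE info (id INTEGER, title TEXT)")
c.executemany("INSERT INTO credit VALUES (?, ?)", [(1, "A"), (2, "B")])
c.executemany("INSERT INTO info VALUES (?, ?)", [(1, "X"), (2, "Y")])

# Same pattern as in the question: rows only, no column names.
c.execute('SELECT * FROM "credit"')
credit_raw = pd.DataFrame(c.fetchall())
print(list(credit_raw.columns))  # integer labels, not 'id'

# One possible fix: take the names from cursor.description (DB-API).
c.execute('SELECT * FROM "credit"')
credit = pd.DataFrame(c.fetchall(), columns=[d[0] for d in c.description])
c.execute('SELECT * FROM "info"')
info = pd.DataFrame(c.fetchall(), columns=[d[0] for d in c.description])

merged = pd.merge(credit, info, on="id")
print(merged)
```

If the printed columns already include 'id' in your case, the cause is more likely a case mismatch or an id that sits in the index.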

Not Specifying an Axis

I am trying to write MDX code that has all the members of my cube on rows. However, when I specify the rows, it says that I must ALSO specify columns. And if I set neither ON ROWS nor ON COLUMNS, my code does not validate.
How do I put all my members on rows and have a single data column returned?
In MDX, ON COLUMNS is axis 0 and ON ROWS is axis 1. For an MDX statement to be valid it must have at least an axis 0. This is one of the rules of the language; there is no getting around it.
Quite often, if I need lots of information ON ROWS but only a single column, I will pick a dimension I am not using, e.g. Language, and then use that dimension's All member ON COLUMNS, just to obey the rule, e.g.
SELECT
[Language].[Language].[All] ON 0, //<<JUST A DUMMY ENTRY BUT MUST BE AN ALL MEMBER OF A DIMENSION
...
...

SSRS - How to get column value by column name

I have one table with rows, and each row has a column that contains a field name (say row 1 - 'Number001', row 2 - 'ShortChar003', etc.). To get the values of these fields I have to use a second table; this table has 1 row with many columns (Number001, Number002, ShortChar003, etc.).
How can I extract the value?
Good question. You can use the Lookup function:
=Lookup(Fields!CityColumn.Value, Fields!CityColumn.Value, Fields!CountColumn.Value, "Dataset1")
Or you might have to use string functions such as Left, Mid, or Right, much like LEFT, SUBSTRING, and RIGHT in SQL. If possible, please post some data from both tables and I will explain in detail.