How can I remove duplicates in PowerPivot? - powerpivot

I have a PowerPivot opened in Excel that's loading a csv file.
I need to remove duplicates in a column but there is no query edit.
What's the best way to do this?
The column in question contains text values like addresses

There are two options. The first is to use Power Query (built into Excel 2016, an add-in for Excel 2013) to clean the data up before it is loaded into Power Pivot. Note that Power Query is case sensitive, so it treats ABC and Abc as different values.
It does have functions to convert strings to upper case, lower case, or proper case (Text.Upper, Text.Lower, Text.Proper), so you can normalize the column before removing duplicates.
The other option is to set up the connection in Power Pivot, then go to the loaded table in the model. In the ribbon, choose Design > Table Properties, which has options for editing the SQL statement used to generate the table (for instance by adding DISTINCT, if the source accepts a SQL query).
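To make the case-sensitivity point concrete, here is a minimal sketch in Python rather than Power Query. The `dedupe` helper and the sample addresses are made up for illustration. It shows that a case-sensitive pass keeps "12 High St" and "12 HIGH ST" as separate values, while comparing on a case-folded key treats them as one.

```python
def dedupe(values, ignore_case=False):
    """Return values with duplicates removed, keeping the first occurrence of each."""
    seen = set()
    result = []
    for value in values:
        # Compare on a case-folded key when asked to ignore case,
        # but keep the original spelling in the output.
        key = value.casefold() if ignore_case else value
        if key not in seen:
            seen.add(key)
            result.append(value)
    return result


# Hypothetical sample data standing in for the address column.
addresses = ["12 High St", "12 HIGH ST", "7 Mill Lane", "12 High St"]

case_sensitive = dedupe(addresses)
case_insensitive = dedupe(addresses, ignore_case=True)
print(case_sensitive)
print(case_insensitive)
```

The same idea applies in Power Query: normalize the column's case first, then remove duplicates.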

Related

Complex filter in Excel using AND/OR Nesting

Is it possible to create a filter in Excel across multiple columns with AND/ORs, or what's the suggested way to do this? For example:
WHERE
(Name = "Tom" AND country in ['US', 'CA'])
OR
(Age > 14 and country not in ['FR', 'DE'])
If not, how does one normally do a SQL-like filter clause in Excel, or is this considered too much of an edge case for Excel to consider?
You may try:
=OR(AND(Name="Tom", OR(country="US", country="CA")),
AND(Age > 14, AND(country<>"FR", country<>"DE")))
The trick is that using AND/OR with Excel requires nesting the logic, which is different than how SQL does it.
You may replace Name and country with the actual Excel cells which contain that particular data.
In Excel, AND and OR are functions. See Excel / Formulas and functions.
Example:
=OR(AND(B2="Tom", OR(B3="US", B3="CA")), AND(B4>14, B3<>"FR", B3<>"DE"))
With the new Dynamic Array formulas (available in most versions of Office 365), the list can be filtered with the Filter() function and the result is displayed outside of the list.
The screenshot shows Tim Biegeleisen's formula with conditional formatting to highlight the TRUE values before filtering the table on that column. Next to that is the result of ONE formula in cell H2, which has automatically spilled to the right and down. For ease of identification, I have included an index column.
Note that AND logic is built with the multiplication operator, whereas OR logic uses the addition operator.
=FILTER(Table1,
((Table1[Name]="Tom")*((Table1[Country]="US")+(Table1[Country]="CA")))+
((Table1[Age]>14)*(Table1[Country]<>"FR")*(Table1[Country]<>"DE")))
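The multiplication/addition trick can be checked outside Excel. The following Python sketch uses made-up sample rows and hypothetical helper names. It evaluates the same condition twice, once with ordinary and/or and once with 0/1 flags combined by `*` and `+`, and confirms that both select the same rows.

```python
# Hypothetical sample rows standing in for Table1.
rows = [
    {"Name": "Tom", "Age": 10, "Country": "US"},
    {"Name": "Tom", "Age": 10, "Country": "FR"},
    {"Name": "Ann", "Age": 20, "Country": "GB"},
    {"Name": "Ann", "Age": 20, "Country": "DE"},
    {"Name": "Bob", "Age": 12, "Country": "CA"},
]


def keep_boolean(r):
    # Direct translation of the SQL-like WHERE clause.
    return (r["Name"] == "Tom" and r["Country"] in ("US", "CA")) or (
        r["Age"] > 14 and r["Country"] not in ("FR", "DE")
    )


def keep_arithmetic(r):
    # Same logic with 0/1 flags: multiplication acts as AND, addition as OR.
    name_is_tom = int(r["Name"] == "Tom")
    in_us_or_ca = int(r["Country"] == "US") + int(r["Country"] == "CA")
    over_14 = int(r["Age"] > 14)
    not_fr = int(r["Country"] != "FR")
    not_de = int(r["Country"] != "DE")
    score = name_is_tom * in_us_or_ca + over_14 * not_fr * not_de
    return score != 0  # any non-zero score means the row is kept


filtered = [r for r in rows if keep_arithmetic(r)]
print([r["Country"] for r in filtered])
```

Any non-zero result counts as TRUE, which is why a sum of 2 (both branches matching) still keeps the row.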

Dynamic Parameter in Power Pivot Query

We are using Excel 2013 and Power Pivot to build modules that consist of several Pivot tables that are all pulling data from the same Power Pivot table, which queries our T-SQL data warehouse.
In an effort to simplify and fully automate this module, we wanted to create a text field that would allow a user to enter a value (a client ID# for example), and then have that value be used as a parameter in the Power Pivot query.
Is it possible to pass a Parameter in the Power Pivot query, which is housed in a text field outside of the query?
You can also pass a slicer or combobox selection to a cell. Define a name for that cell. Put that cell (and others if you have multiple text variables to use) in a table. For convenience, I usually name this table "Parameters". You can then 'read in' the parameters to your query and drop them in your query statements.
The code at the top of your query to read these parameters in might look like...
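let
    // Read the worksheet table named "Parameters" into the query
    Parameter_Table = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
    // Row positions are zero-based, so {1} is the second row of the table
    XXX_Value = Parameter_Table{1}[Value],
    YYY_Value = Parameter_Table{2}[Value],
    ZZZ_Value = Parameter_Table{3}[Value],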
This is followed by your query, in which a manually typed value, say a customer called "BigDataCo", is replaced with XXX_Value.
Refreshing the link each time a different customer is selected will indeed be a very slow approach, but this has worked for me.
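The general pattern (look a value up in a name/value parameter table, then use it in place of a hard-coded literal) can be sketched in Python with an in-memory SQLite database. The `sales` table, its columns, and the parameter names are all hypothetical. This is not how Power Query or Power Pivot execute the query, only an illustration of the idea.

```python
import sqlite3

# Hypothetical stand-in for the worksheet's "Parameters" table (name/value rows).
parameter_rows = [("Customer", "BigDataCo"), ("Region", "West")]
parameters = dict(parameter_rows)

# Small in-memory table standing in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("BigDataCo", 100), ("SmallCo", 40), ("BigDataCo", 25)],
)

# The value read from the parameter table replaces the hard-coded customer name.
# A bound placeholder (?) is used instead of pasting the text into the SQL string.
query = "SELECT SUM(amount) FROM sales WHERE customer = ?"
total = conn.execute(query, (parameters["Customer"],)).fetchone()[0]
print(total)
```

Binding the value as a placeholder, where the data source allows it, avoids problems with quotes or unexpected characters in user-entered text.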
Rather than pass a parameter to the data source SQL query, why not use a pivot table filter or slicer to let users filter the data dynamically? This is much faster than refreshing the data from the source.
If for some reason you need to pass this directly to the source query, you'll have to do some VBA work.

sorting BIG excel data

Right so, I have been given a LOT of "consumer data" to sort: 3 Excel files, each containing up to 7 worksheets, each with up to 1M rows (the maximum worksheet size in Excel 2013 is just over 1M rows).
I need to pull out of these all people within a region, so have a list of post codes in this region (say 30 post code areas)
How can I achieve this most easily?
If the data was in SQL Server, I'd just write a long SQL statement selecting all rows where postcode LIKE 'B75%' OR postcode LIKE 'B74%' etc.
But in Excel I can only run a "filter" on one worksheet at a time... (I think)
Is it going to be easiest to throw all the data into sql server, or have I overlooked a method?
The first solution is to let Excel do the work: add filters to the columns and use the sort and filter options on each worksheet.
The other solution is to import the data into SQL Server tables. To do this, open SQL Server Management Studio, right-click the database you want to import the data into, and select "From Excel File". Do this for each Excel file you have. Once all the data is in the database, select and sort it with a SQL query.
The second solution is more reliable, but the first is faster. You need to decide which one suits you.
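Whichever tool holds the data, the core operation is a prefix match applied to every worksheet in turn. Here is a minimal Python sketch of that logic, assuming the worksheets have already been read into memory. The `workbook` dictionary, the sample rows, and the `in_region` helper are hypothetical, and the reading of the actual Excel files is left out.

```python
# Postcode prefixes for the region (the question mentions about 30 of these).
prefixes = ("B74", "B75")

# Hypothetical stand-in for worksheets already loaded as (name, postcode) rows.
workbook = {
    "Sheet1": [("Alice", "B75 5AA"), ("Bob", "M1 1AE")],
    "Sheet2": [("Carol", "b74 2xy"), ("Dave", "B7 4AB")],
}


def in_region(postcode, prefixes):
    # Equivalent in spirit to: postcode LIKE 'B74%' OR postcode LIKE 'B75%'.
    # str.startswith accepts a tuple, so one call checks every prefix.
    return postcode.strip().upper().startswith(prefixes)


# Apply the same filter to every worksheet and collect the matches in one list.
matches = [
    (sheet, name, postcode)
    for sheet, rows in workbook.items()
    for name, postcode in rows
    if in_region(postcode, prefixes)
]
print(matches)
```

Note that a bare prefix match behaves like the SQL LIKE version, so 'B7%' would also match "B75 5AA". Use the full outward code as the prefix if that matters.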

collecting SQL statements using Excel

In my everyday work, I receive data in Excel spreadsheets, which I need to insert into a relational database.
To accomplish this, I prepare formulas that generate "insert" statements (I use both insert and select statements, for example to look up the ID of all elements with a specific label).
Because those spreadsheets are complex, they contain SQL commands in more than one column.
This is the point where the problems begin: I cannot simply select all the cells, copy them, and paste them into SQL Server (it concatenates the contents of cells in the same row).
In most cases I prepare an additional sheet where I collect all the statements in one column (using a simple formula that copies the text from the other cells). Unfortunately, preparing such a sheet is time consuming and error prone (for example if I forget a column or add rows).
Is there any more convenient way to do it?
I thought about writing a macro which collects all values from a selected range.
Is it a good idea, or can I use something better?
You can do all that using VBA.
You know the rules, so you have the business logic in your head. Now, just type the code to do it :)
If you want, you can do the insert from Excel using something like this.
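As a language-neutral illustration of the macro idea raised in the question, here is a Python sketch (not VBA) of the core logic: walk the selected range row by row, skip empty cells, and emit every statement on its own line. The `collect_statements` helper and the sample range are hypothetical. A VBA macro would loop over the cells of the selection in the same way.

```python
def collect_statements(cell_range):
    """Flatten a 2-D range of cells into a single list of non-empty statements."""
    statements = []
    for row in cell_range:
        for cell in row:
            # Empty cells may come through as None or as blank strings.
            text = (cell or "").strip()
            if text:
                statements.append(text)
    return statements


# Hypothetical selection: two rows, SQL spread over several columns.
selected = [
    [
        "INSERT INTO label (name) VALUES ('red');",
        None,
        "INSERT INTO item (label_id) SELECT id FROM label WHERE name = 'red';",
    ],
    ["INSERT INTO label (name) VALUES ('blue');", "", None],
]

statements = collect_statements(selected)
# One statement per line, ready to paste into a query window.
script = "\n".join(statements)
print(script)
```

Because the loop works on whatever range is passed in, added rows or columns are picked up automatically, which avoids the forgotten-column problem described in the question.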

Dynamic SSRS report

I have a problem creating a dynamic report in SSRS. My problem is:
In a table I have stored SQL scripts in a column named SQLScripts. If you execute these SQL scripts, you get a different number of columns for each script.
I have one report with a button for each of these scripts, for example test1, test2, and so on. If you press the test1 button, the report should run the test1 SQL script and display its results with the appropriate columns for that script.
I can't create individual reports for each test report, as there are too many. Are there any options for me to solve this problem?
The only way I've been able to get this to work so far is:
Each report has 2 datasets.
ReportData
DataHeaders
The "DataHeaders" dataset needs to contain the exact names of the data fields in "ReportData". Be careful, since SSRS replaces blanks and special characters with "_".
Now, create a table (or matrix) and drag the DataHeaders as the Columns of your report. (This should be a grouped column). If you run it at this point, you'll see all your columns without any data. Now comes the magic:
Create another report that takes a "DataField" parameter. Create another table or matrix within this report and set its dataset property to "ReportData". In the DATA cell for the table, set it to the expression =Fields(Parameters!DataField.Value).Value
Now go back to your first report. Right click and insert a subreport. Right click on the subreport and select "Subreport Properties". Under general, select the second report you created to be used as the subreport. Under parameters, select the DataField parameter and set its value to something like =Fields!DataField.Value
In my case I did some formatting in this expression to fix the above mentioned issue with spaces and special characters, since my stored procedure was initially used in ASP.NET and this was just a proof of concept.
Also in my experience the performance isn't great. In fact it was kinda slow, though I haven't had a chance to switch it to use a shared dataset, which I suspect would help a bit. Please let me know if you find a better solution.
I have not found a way to do this completely dynamically. Here is a similar question with some possible solutions:
How do i represent an unknown number of columns in SSRS?
You basically need to first create a 'master dataset' from the other datasets that are based on your many SQL scripts. The master dataset should contain the data to be presented in its simplest form, i.e. as a simple list.
Finally, go to the toolbar in SSRS and drag a 'Matrix' into the report. A matrix acts much like a pivot table in Excel or a crosstab query in Access and will display whatever is in the dataset.
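One way to read "simple list format" is a long layout with one record per cell: row number, column name, value. That shape is the same no matter how many columns a script returns, and a matrix can pivot it back into columns. The Python sketch below illustrates this interpretation only. The helper names and the sample result are hypothetical, and SSRS would do the pivoting itself through the matrix's row and column groups.

```python
def to_long_format(rows):
    """Unpivot result rows (dicts) into (row_id, column_name, value) records."""
    long_rows = []
    for row_id, row in enumerate(rows, start=1):
        for column_name, value in row.items():
            long_rows.append((row_id, column_name, value))
    return long_rows


def to_matrix(long_rows):
    """Pivot long records back into headers plus a grid, as a matrix would."""
    columns = []
    cells = {}
    for row_id, column_name, value in long_rows:
        if column_name not in columns:
            columns.append(column_name)  # keep first-seen column order
        cells.setdefault(row_id, {})[column_name] = value
    body = [[cells[row_id].get(c) for c in columns] for row_id in sorted(cells)]
    return columns, body


# Hypothetical output of one stored script; another script could return
# entirely different columns and still fit the same long layout.
test1_result = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

long_rows = to_long_format(test1_result)
columns, body = to_matrix(long_rows)
print(columns)
print(body)
```

In the report, the column name would become the matrix's column group, the row number its row group, and the value its data cell.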