Building an engine in Pentaho - pentaho

Basically, I want to create an engine that reads in all data sources and pushes them out. I'll need a flag to turn applications on and off, and a sequence column to control which one runs first. It will use a file-based repository.
Does anyone have any useful ideas or links on how I would go about doing this?
I am reading in an Excel file with applications as the field.
I have copied the rows to the result, but I want to output multiple Excel files, each named after an application. So each row would become the name of an Excel file.
I'm not sure how to do this.

Related

How to Create a Program Which Searches for Values from a .txt or any Text Document in Specific Folders

I am relatively new to programming and want to create a program which can solve a problem that I frequently have.
So here's the background to my short story: I was on a website which hosted many files (we're talking about 500-1000 small files). I thought, "Oh sweet! I want to have all these things on my hard drive so I know that I have access to them... but am probably not going to use them either way." I proceeded to download all 500-1000 files from that site, but ran into a problem when I looked at the properties of my destination folder. Let's say that out of 500 files on the site, my computer only had 499. Just my luck. I wanted to find out which pesky file had slipped right by me and download that file specifically. What I didn't want to do was delete all the files and then try my luck once more at downloading everything from the website. The site gave no indication of which files I had downloaded, so I was completely in the dark. I could Ctrl+C each item and then Ctrl+V it into the file manager search bar, but it would be tedious to repeat that 500 times.
Now, what I want to do: I want to take all of the file names from the website (the file names on the site and on my drive are the same) and put them in a simple .txt document or something similar. The website has a lot of unwanted text alongside the text I need; if it is not possible to extract the names from the site like this, then I am OK with entering them manually via copy and paste. Then I want the computer to take each value in the document and search for it in a specific folder path (note: the actual files are in subfolders within the root folder I want to choose, so the program has to be able to search within multiple folders of the root). If a file doesn't exist, I want that value to be displayed as output. I want this cycle to repeat until all the values have been gone through. The output should list the values that were not present.
Conclusion: You probably now get what I am trying to do; if you don't, tell me what I need to elaborate on. I really don't care how this program is made (what language or software), I just want something that works... but I don't know how to create it myself.
Thanks for reading and any response is appreciated!
Dhanwanth P :)
Here's a solution in Python in case you would like to explore...
Similar to what you described, all files from the website are listed in an Excel file, 'website_files.xlsx'.
All downloaded files are saved in a folder, 'downloaded_wav'. The script works regardless of whether the files are saved in the root directory or in subfolders.
Then I run the Python script below to look for the missing file:
import pandas as pd
import os
path_folder = 'C:\\Users\\Admin\\Downloads\\downloaded_wav'
downloaded_files = []
d, m = 0, 0  # d: files downloaded, m: files missing
for path_name, subfolders, files in os.walk(path_folder):  # include all subfolders
    for file in files:
        d += 1
        downloaded_files.append(file)
df = pd.read_excel('website_files.xlsx')
# Each row of df.values is a one-element array, so this check relies on
# the sheet having a single column of file names
for file in df.values:
    if file not in downloaded_files:
        print('MISSING', file)
        m += 1
print(len(df), 'files on website')
print(d, 'files downloaded')
print(m, 'missing file(s) found')
Output:
MISSING ['OLIVER_snare_disco_mixready_hybrid.wav']
3 files on website
2 files downloaded
1 missing file(s) found
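A variant of the same check, as a minimal sketch: it reads the expected names from a plain .txt file (one name per line, as the question describes) instead of Excel, and needs only the standard library. The function name `find_missing` and the one-name-per-line layout are assumptions made for illustration; names are compared exactly, including case and extension.

```python
import os


def find_missing(list_path, root_folder):
    """Return the names listed in list_path that are not found under root_folder."""
    # Expected names: one per line, surrounding whitespace and blank lines ignored.
    with open(list_path, encoding="utf-8") as fh:
        expected = [line.strip() for line in fh if line.strip()]

    # Collect every file name in the root folder and all of its subfolders.
    downloaded = set()
    for _path, _subfolders, files in os.walk(root_folder):
        downloaded.update(files)

    # Keep the order of the list file so the result is easy to check by eye.
    return [name for name in expected if name not in downloaded]
```

Calling `find_missing` with the path to the list file and the download folder returns the missing names as a list, which can then be printed one per line.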
No worries; I found a solution by myself using Excel (God, it's powerful!).
Basically, I copied and pasted my values from the website, then used a filter to show the values only with .wav. Then I used a Power Query from the folder to get me a list of all names of files in a folder. Finally, I went ahead and compared the two using a formula:
=IF(COUNTIF(B:B,D1),"OK","MISSING")
If you need more elaboration, I'd be happy to help; just reply to this. There might be an easier way, but I personally liked the straightforwardness of this. You only need Microsoft Excel!
EDIT:
For me, I used these two videos which go over the power query and countif function:
How to Get the List of File Names in a Folder in Excel (without VBA): https://www.youtube.com/watch?v=OSCPVBWOqwc
How to Compare Two Excel Sheets (and find the differences): https://www.youtube.com/watch?v=8Ou_wfzcKKk
In my case, I made my sheet look like this:

Toad Automation load all various excel files from a folder

I have to load all Excel files into a table.
The issue is that the file name is always going to be different, and the same goes for the sheet name. Is it possible to set up Toad Automation to load all the Excel files using the first sheet, regardless of its name? Once that is done, I want to move the Excel files to a separate location.

Read .pbix file content through C# or java

I am trying to use Java, C#, or any other programming language to modify a .pbix file generated by Microsoft Power BI. Is there a DLL provided by Power BI, or how else can I read the content programmatically? I just want to get and update the data source directory. Please help.
Thanks.
I don't think it's possible, and even if it is, the solution is likely inelegant.
Even if you managed to do this, you would need to open your PBIX file in the PBI Desktop to refresh your data.
Are you doing this because you have many queries and it's inconvenient to change data source string (folder name) of all of them? There is a way to keep your connection string in a single variable as described here.
I don't know your exact setup, but looking at your question, let's say you have sets of files in different folders and you want to change the folder in one step.
To use the approach from the link above but with file input, you need to do the following:
1. If it's a new report, import your files as usual.
2. Create a new query: "New Source" -> "Blank Query".
3. You will see "Query1" and an empty text box. Enter the folder name, for example "C:\", and rename this query to "Folder".
4. Go to your imported file in the query editor ("test1" in my example). In the query settings on the right, select "Source".
5. Change the file name by substituting the folder with your "Folder" query, for example from
   ...File.Contents("C:\test1.csv"),...
   to
   ...File.Contents(Folder & "test1.csv"),...
6. Repeat for all imported files, then "Close & Apply".
Now whenever you need to change the folder with your files, edit your "Folder" value and "Refresh".

Automatically import new csv file data into a "Database" Excel workbook

My situation:
At a competition, we will have 6 "scorers" each using a separate android tablet. For every game (there will probably be 70 or 80 throughout the tournament), each person will score accordingly on a custom app that will create a .csv file. (To be clear, each match will result in 6 separate, 1 row, csv files.) The format of the data will be the same from game to game, and from scorer to scorer. I can have control over the names of these files such as "[Scorer#]_[Match###].csv". These tablets will all be connected to a central computer via USB.
What I would like to do:
I would like to be able to have the data from all of those files automatically populate a "database" table on a single sheet. If possible, I would like a folder to act as a "watch folder" of sorts, where, as a new file shows up in a folder, that data is automatically ingested into the table. If that is not possible, I would be happy with a single function I could run to check for new data after each game ended.
I had considered using Power Query, but wasn't sure whether it could lead me to a usable solution.
Any suggestions would be greatly appreciated!
(and I apologize if anything is unclear. I'm happy to clear up any confusion)
Power Query is a good fit in that scenario. You can set up a query that loads all files in a specific folder and appends the contents. Refresh the query when new files have been added to the folder.
For detailed instructions on how to set up such a query, take a look here:
http://excelunplugged.com/2015/02/10/get-data-from-folder-in-power-query/
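For comparison, the "load every file in the folder and append the contents" idea can also be sketched outside Excel. This is a minimal Python sketch, not the Power Query solution itself. It assumes each CSV has the same columns and no header row, and that the files follow the "[Scorer#]_[Match###].csv" naming from the question; the function name `ingest_new_files` is made up for illustration. It remembers which files it has already read, so it can be rerun after each game.

```python
import csv
from pathlib import Path


def ingest_new_files(watch_folder, table, seen):
    """Append rows from CSV files not yet in `seen` to `table`; return the number of new files."""
    added = 0
    # sorted() gives a stable order, e.g. 1_001.csv before 2_001.csv.
    for path in sorted(Path(watch_folder).glob("*.csv")):
        if path.name in seen:
            continue  # already ingested on an earlier run
        with path.open(newline="", encoding="utf-8") as fh:
            for row in csv.reader(fh):
                if row:  # skip blank lines
                    # Prefix each row with the file stem so scorer and match stay traceable.
                    table.append([path.stem] + row)
        seen.add(path.name)
        added += 1
    return added
```

Calling it repeatedly with the same `table` list and `seen` set picks up only the files that appeared since the previous call, which mirrors refreshing the query after new files have been added.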

Excel, vba, and onedrive or sharepoint file sharing

I have two local Excel files on my hard drive. Both have macros to achieve certain goals. The end result is that when I click a command button in one Excel file (let's call it 'A'), the macro behind that button transfers the data to the other file (let's call this one 'B') in a certain format.
All this works great. The source file 'A' is accessible to everyone for data entry; the destination file is read-only to maintain data integrity, but a macro is able to write into it. Both files are in a shared folder so that anyone can enter data in one file and transfer it to the other.
Now I want to keep the same functionality but on SharePoint or OneDrive. Unfortunately, I am unable to do so.
I am not sure what your exact requirements are, but I think you can use SharePoint lists with SPD workflows to meet them instead of using Excel files.
Whenever business users need the data in Excel, they can always export it from the list.
There is also an easy way to let users enter data in Excel while it is stored in a SharePoint list.