Where is the Sayaka voice in Speech API OneCore? - text-to-speech

Windows 10. I've installed the Japanese TTS voices in the Settings. Now, when I use voice enumeration in Speech API 5.4 OneCore (not in 5.4 proper though), I get 6 voices:
David
Zira
Ayumi
Haruka
Mark
Ichiro
The Speech settings page also shows those 6. But there's clearly a seventh one in the registry, Sayaka (HKLM\SOFTWARE\WOW6432Node\Microsoft\Speech_OneCore\Voices\Tokens\MSTTS_V110_jaJP_SayakaM). Its files are present under C:\windows\Speech_OneCore\Engines\TTS\ja-JP. Compared to the rest, there's an extra file, .heq. Why doesn't it enumerate?
The enumeration code goes:
#import "libid:E6DA930B-BBA5-44DF-AC6F-FE60C1EDDEC8" rename_namespace("SAPI") //v5.4 OneCore
HRESULT hr;
SAPI::ISpVoicePtr v;
v.CreateInstance(__uuidof(SAPI::SpVoice));
SAPI::ISpObjectTokenPtr tok;
hr = v->GetVoice(&tok); //Retrieve the default voice
SAPI::ISpObjectTokenCategoryPtr cat;
hr = tok->GetCategory(&cat); //Retrieve the voices category
SAPI::IEnumSpObjectTokensPtr toks;
hr = cat->EnumTokens(0, 0, &toks);
//And enumerate
unsigned long i, n;
hr = toks->GetCount(&n);
LPWSTR ws;
for (i = 0; i < n; i++)
{
    hr = toks->Item(i, &tok);
    hr = tok->GetId(&ws);
    CoTaskMemFree(ws);
}
The only other mention of Sayaka online that I could find is here
Edit
Enumerating by Reset()/Next() gives the same 6. Trying to create a token directly around the registry path gives error 0x8004503a (SPERR_NOT_FOUND). Doing so while watching with Process Monitor reveals an interesting fact: rather than Sayaka under HKLM, the process interrogates the following key:
HKCU\Software\Microsoft\Speech_OneCore\Isolated\7WUiMB20NMV5Y7TgZ2WJXbUw32iGZQSvSkeaf0AevtQ\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens\MSTTS_V110_jaJP_SayakaM
There's indeed a key like that under HKCU, and it contains a copy of HKLM and HKCU settings for SAPI, and there's indeed no Sayaka under Voices in that key. Just the six I've mentioned.
So there's some kind of isolation going on, with SAPI settings in several copies. There are 7 different subkeys under Isolated, and the voice sets are different under those. Two contain voices that have nothing in common with the ones we know, and those have to do with Cortana. Hard to tell what's the unit of isolation - maybe a user, maybe an app package (in the UWP sense).
Edit
Like I suspected, there's an app package based isolation going on. I've created a brand new project with the same code, ran it, and got a different isolation key - F2yLLxINh6S1e3y3MkJo4ilfh036RB_9pHLEVL88yL0. Looks like every time you run a SAPI enabled application, it derives an isolation profile from the current executable. A moment ago, that isolation profile wasn't there, now it is. So it was created by SAPI on the fly. I don't think the voices are hard-coded, so it copied the voices in the isolation profile from somewhere, from the master list.
Where is the master list? It's not HKLM\...\Speech_OneCore, since one can see Sayaka is there. It could be tokens_TTS_ja-JP.xml under C:\Windows\SysWOW64\Speech_OneCore\Common\ja-JP, since Ayumi/Ichiro/Haruka are listed there but Sayaka isn't. The security on that file is quite draconian though, I'm having trouble editing that file even with admin rights. Also, it's a second hardlink to C:\Windows\WinSxS\wow64_microsoft-windows-t..peech-ja-jp-onecore_31bf3856ad364e35_10.0.18362.1_none_46741f8a666da90a.
The SysWOW64\Speech_OneCore folder allows write for administrators, but SysWOW64\Speech_OneCore\Common doesn't. Only TrustedInstaller can write it.
By the way, the isolation logic is specific to OneCore. SetId() in SAPI 5.4 proper looks in the key that matches the provided Id.
Alternative approach: the SAPI 5.4 docs mention the ISpRegDataKey interface, that lets one initialize a token directly from a HKEY. It's not in the typelib though.

This answer is about enabling Sayaka for those SAPI apps that don't explicitly opt in.
The master list of Japanese TTS voices is under C:\Windows\System32\Speech_OneCore\Common\ja-JP. It's not just one file - SAPI enumerates all XMLs there. The problem is, in order to write files to that folder one will need a utility that lets one run programs as TrustedInstaller. Those exist; there's a list here. I've used the one called PowerRun.
You need to create a file called something like tokens_TTS_ja-JP_Sayaka.xml (the exact name doesn't really matter) with the following content:
<?xml version="1.0" encoding="utf-8"?>
<Tokens>
<Category name="Voices" categoryBase="HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore">
<Token name="MSTTS_V110_jaJP_SayakaM">
<String name="" value="Microsoft Sayaka - Japanese (Japan)" />
<String name="LangDataPath" value="%windir%\Speech_OneCore\Engines\TTS\ja-JP\MSTTSLocjaJP.dat" />
<String name="VoicePath" value="%windir%\Speech_OneCore\Engines\TTS\ja-JP\M1041Sayaka" />
<String name="411" value="Microsoft Sayaka - Japanese (Japan)" />
<String name="CLSID" value="{179F3D56-1B0B-42B2-A962-59B7EF59FE1B}" />
<Attribute name="Version" value="11.0" />
<Attribute name="Language" value="411" />
<Attribute name="Gender" value="Female" />
<Attribute name="Age" value="Adult" />
<Attribute name="DataVersion" value="11.0.2016.0221" />
<Attribute name="SharedPronunciation" value="" />
<Attribute name="Name" value="Microsoft Sayaka" />
<Attribute name="Vendor" value="Microsoft" />
<Attribute name="SayAsSupport" value="spell=NativeSupported; cardinal=GlobalSupported; ordinal=NativeSupported; date=GlobalSupported; time=GlobalSupported; telephone=NativeSupported; address=NativeSupported; message=NativeSupported; url=NativeSupported; currency=NativeSupported; alphanumeric=NativeSupported" />
<Attribute name="SampleText" value="既定の音声として%1を選びました" />
</Token>
</Category>
</Tokens>
And then copy that file, as TrustedInstaller, to C:\Windows\System32\Speech_OneCore\Common\ja-JP. On 64-bit Windows, also place a copy into C:\Windows\SysWOW64\Speech_OneCore\Common\ja-JP to cover the 32-bit applications.
Then all desktop SAPI applications will get Sayaka too, even the ones that already had an isolated settings key at the moment. It looks like SAPI refreshes the isolated settings from the master list, if necessary.
Sayaka will show up in the voice list under Settings/Speech, too, and say her greeting if asked.

If the isolation registry key doesn't have Sayaka, but HKLM does, an application can copy the Sayaka token to the isolation key on the first run. The key insight here is that the isolation key is writable without elevation, and SAPI supports creating and populating tokens. This doesn't rely on the specifics of isolation. Create a token with a hard-coded ID for Sayaka, and copy the properties and the attributes from HKLM. Like this:
#import "libid:E6DA930B-BBA5-44DF-AC6F-FE60C1EDDEC8" rename_namespace("SAPI") //v5.4 OneCore
//Get the default voice to avoid hard-coding the category
SAPI::ISpVoicePtr v;
SAPI::ISpObjectTokenPtr tok;
v.CreateInstance(__uuidof(SAPI::SpVoice));
v->GetVoice(&tok);
LPWSTR ws;
tok->GetId(&ws);
wchar_t TokID[200];
wcscpy_s(TokID, ws);
CoTaskMemFree(ws);
//Check if Sayaka is already registered in SAPI
SAPI::ISpObjectTokenCategoryPtr cat;
tok->GetCategory(&cat); //The category of voices
SAPI::IEnumSpObjectTokensPtr toks;
cat->EnumTokens(L"name=Microsoft Sayaka", 0, &toks);
unsigned long n;
toks->GetCount(&n);
if (n == 0) //Sayaka is not registered already
{
    //Is Sayaka present under HKLM\..\Voices\Tokens?
    HKEY hkSayaka, hkAttrs;
    if (RegOpenKeyEx(HKEY_LOCAL_MACHINE, L"SOFTWARE\\Microsoft\\Speech_OneCore\\Voices\\Tokens\\MSTTS_V110_jaJP_SayakaM", 0, KEY_READ, &hkSayaka) == ERROR_SUCCESS)
    {
        if (RegOpenKeyEx(hkSayaka, L"Attributes", 0, KEY_READ, &hkAttrs) == ERROR_SUCCESS)
        {
            //If yes, create a Sayaka token where SAPI OneCore thinks it should be!
            //Replace the final path component of the default voice's ID with Sayaka
            LPWSTR pbs = wcsrchr(TokID, L'\\');
            wcscpy_s(pbs + 1, _countof(TokID) - (pbs - TokID) - 1, L"MSTTS_V110_jaJP_SayakaM");
            tok.CreateInstance(__uuidof(SAPI::SpObjectToken));
            //Note the 1 in the third parameter - "create if needed"
            HRESULT hr = tok->SetId(0, (LPWSTR)TokID, 1);
            DWORD dwi;
            wchar_t ValName[100]; //Enough
            unsigned char ValData[1000]; //Enough
            DWORD ValNameLen, ValDataLen, Type;
            //Copy all values from the Sayaka key
            //They are all strings
            for (dwi = 0; RegEnumValue(hkSayaka, dwi, ValName, &(ValNameLen = _countof(ValName)), 0, &Type, ValData, &(ValDataLen = sizeof(ValData))) == ERROR_SUCCESS; dwi++)
                tok->SetStringValue(ValName, (LPWSTR)ValData);
            //Copy all attributes from the Sayaka\Attributes key
            //All strings too.
            SAPI::ISpDataKeyPtr attrs;
            tok->CreateKey((LPWSTR)L"Attributes", &attrs);
            for (dwi = 0; RegEnumValue(hkAttrs, dwi, ValName, &(ValNameLen = _countof(ValName)), 0, &Type, ValData, &(ValDataLen = sizeof(ValData))) == ERROR_SUCCESS; dwi++)
                attrs->SetStringValue(ValName, (LPWSTR)ValData);
            RegCloseKey(hkAttrs);
        }
        RegCloseKey(hkSayaka);
    }
}
A similar approach to exposing the hidden TTS voices is described here: https://www.ghacks.net/2018/08/11/unlock-all-windows-10-tts-voices-system-wide-to-get-more-of-them/
Since my original problem was limited to one TTS enabled app, I'm going to accept this answer and not the other one. That said, the whole issue with not inviting Sayaka to the party is probably a Microsoft oversight that they should ultimately address. Feel free to upvote my Feedback Hub request. Windows 10 users only.

How to resolve Checkstyle error: 'method def modifier ' has incorrect indentation

Got a checkstyle error that states a member def modifier has incorrect indentation level 4 and is expected to be level 2.
Apart from having Checkstyle as a plugin, you should download its jar file as well, because there you can see what the Google and Sun check files are doing to your code. Truth be told, the Checkstyle documentation is hard to understand, and having those files at hand makes it easier to see what is going on.
Getting back to your question: there is a module called Indentation with a property called basicOffset, which sets the number of spaces Checkstyle expects per indentation level when scanning your code. Here is an example:
<module name="TreeWalker">
  <module name="Indentation">
    <property name="basicOffset" value="2"/>
    <property name="caseIndent" value="2"/>
  </module>
</module>
The XML above is a simple example of how modules nest: Indentation resides inside TreeWalker, and I added a second property, caseIndent, to show that one module can carry several properties. basicOffset has the value 2, so you might say: wait, the message I got said 4, not 2. I'll explain:
class Foo { // no space at the left side
  private void fooMethod() { // a tab or 2 spaces at the left side
    int a = 0; // 2 spaces from the method's declaration plus 2 for this level is 4
  }
}
You should definitely look at the XML files I mentioned before, replicate the basic parts in your own configuration, and play around to get a better understanding. You can get more info from here.
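The arithmetic behind the "level 4, expected level 2" message can be sketched in a few lines. This is a hypothetical helper written in C++ for illustration, not Checkstyle code: the name expectedIndent and its parameters are made up, and it assumes the expected column is simply the nesting depth multiplied by basicOffset.

```cpp
#include <cassert>

// Hypothetical helper (not part of Checkstyle): expected indentation in
// spaces for code nested `depth` levels deep, given the basicOffset setting.
int expectedIndent(int depth, int basicOffset) {
    assert(depth >= 0 && basicOffset >= 0);
    return depth * basicOffset;
}
```

With basicOffset set to 2, a method inside a class (depth 1) is expected at column 2 and a statement inside that method (depth 2) at column 4. A method indented by 4 spaces under that configuration is what the message in the question complains about.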

Domino Mailbox tool hung with NAMELookup2

We have developed tools that read the emails from Domino mailboxes and write them to separate files on the local disk (essentially a backup). Recently we set up a new Domino 9 test environment in our lab, but the tools do not work properly with it. To narrow down the issue I added some debug logging, and it looks like control hangs inside the function "NAMELookup2". Here is the code snippet:
DHANDLE hLookup;
char *pLookup;
if (NAMELookup2("Local", 0, 1, "$users", 1, dominoUser, 2, "FullName", &hLookup) == NOERROR) // hangs on this line
{
    pLookup = (char *) OSLockObject(hLookup);
}
The same tool works fine with our other test environment, so I don't think the problem is in the code. I suspect the problem is in the setup of the new environment: maybe I failed to grant some permission to the users, or failed to add the mailboxes somewhere, etc.
Note:
I have run the tool as a user with admin privileges.
It would be great if anyone could give some direction on this.
Thanks,
See this NAMELookup2 page for reference. The function is declared as:
STATUS LNPUBLIC NAMELookup2(const char far *ServerName, DWORD Flags,
WORD NumNameSpaces, const char far *NameSpaces,
WORD NumNames, const char far *Names,
WORD NumItems, const char far *Items,
DHANDLE far *rethBuffer);
where NumItems is the number of null-terminated item names starting at the Items address. The code snippet in your question is passing a single item name ("FullName"), but is setting NumItems to 2. That is clearly wrong and could explain the hang. NumItems should be 1.
I am also suspicious of the ServerName argument. The documentation recommends passing NULL when you want to do a local lookup. Passing "Local" may be another way to accomplish the same, but you need to change your code in any case. I recommend changing the first argument to NULL.
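To make the NumItems point concrete, here is a minimal sketch of the packed layout the declaration implies: item names stored back to back, each with its own terminator. It is plain C++ and does not call the Notes API. The helpers packItems and countItems are hypothetical names, and "MailFile" is used only as an example of a second item name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: pack item names back to back, each followed by its
// own '\0', which is the layout described for the Items argument.
std::string packItems(const std::vector<std::string>& names) {
    std::string packed;
    for (const std::string& name : names) {
        // A name must not contain an embedded terminator.
        assert(name.find('\0') == std::string::npos);
        packed += name;
        packed.push_back('\0');
    }
    return packed;
}

// Hypothetical helper: count the null-terminated names stored in a packed
// buffer of known size. This is the value NumItems has to match.
std::size_t countItems(const std::string& packed) {
    std::size_t count = 0;
    std::size_t pos = 0;
    while (pos < packed.size()) {
        pos += std::strlen(packed.data() + pos) + 1; // skip the name and its terminator
        ++count;
    }
    return count;
}
```

A C string literal such as "FullName" holds exactly one name, so the matching count is 1. Passing 2 asks the callee to read a second name from whatever bytes happen to follow the literal's terminator. To look up two items, the caller would pass something like packItems({"FullName", "MailFile"}).c_str() together with a count of 2.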

Get USB disk drive letter by device path or handle

My goal is to write a c-dll (compiled with MinGW) that is able to search for certain models of USB sticks connected to the computer and deliver the serial number, the vendor ID, the product ID and the drive letter.
I have searched the internet for several hours now but could not find an approach that works for me.
I am using the Setup Api to get a list of all connected USB devices. For each USB device I get a path that looks like this:
\\?\usb#vid_048d&pid_1172#00000020370220#{a5dcbf10-6530-11d2-901f-00c04fb951ed}
From that string I can get the vendor ID, product ID and the serial number I am looking for.
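Extracting the three values from such a path could look like the following minimal sketch. It is portable C++ with no SetupAPI calls. The type UsbIds and the function parseUsbDevicePath are hypothetical names, and the sketch assumes the path has the shape shown above: '#'-separated fields, with vid_/pid_ in the second field and the serial number in the third. For composite devices the third field may not be a real serial number.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct UsbIds {
    std::string vendorId;   // four hex digits after "vid_"
    std::string productId;  // four hex digits after "pid_"
    std::string serial;     // third '#'-separated field
    bool ok = false;
};

// Lower-case a copy so that "VID_" and "vid_" are treated alike.
static std::string toLower(std::string s) {
    for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Split a path of the form  <prefix>#vid_XXXX&pid_YYYY#<serial>#{guid}
UsbIds parseUsbDevicePath(const std::string& devicePath) {
    UsbIds ids;
    const std::string path = toLower(devicePath);
    const std::size_t first = path.find('#');
    if (first == std::string::npos) return ids;
    const std::size_t second = path.find('#', first + 1);
    if (second == std::string::npos) return ids;
    const std::size_t third = path.find('#', second + 1);
    if (third == std::string::npos) return ids;
    assert(first < second && second < third);

    // Hardware id field, e.g. "vid_048d&pid_1172"
    const std::string hw = path.substr(first + 1, second - first - 1);
    const std::size_t vid = hw.find("vid_");
    const std::size_t pid = hw.find("pid_");
    if (vid == std::string::npos || pid == std::string::npos) return ids;

    ids.vendorId = hw.substr(vid + 4, 4);
    ids.productId = hw.substr(pid + 4, 4);
    // Take the serial from the original string to keep its spelling.
    ids.serial = devicePath.substr(second + 1, third - second - 1);
    ids.ok = true;
    return ids;
}
```

The ok flag lets the caller skip paths that do not follow the assumed shape instead of reading garbage from them.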
My problem is now to determine the drive letter of the USB drive that is related to this device path.
During my internet research I found the following approach multiple times (for example here http://oroboro.com/usb-serial-number/):
Once the device path is found, the USB drive must be opened by CreateFile. The handle returned by that function can be used to get the device number by function DeviceIOControl with IOCTL_STORAGE_GET_DEVICE_NUMBER.
After that, CreateFile can be used to open each drive letter (starting from a:) and get its device number in the same way as described above. Once the same device number is found again, the relation between device path and drive letter is established.
My Problem is that the IOCTL_STORAGE_GET_DEVICE_NUMBER call is not working. The DeviceIOControl function returns error code 50 which means "The request is not supported".
I am not able to create a link between the device path of a USB stick and the drive letter. I have tried several IOCTL_STORAGE and IOCTL_VOLUME calls but none worked for the USB sticks I tried.
I also read in another Forum that people had problems with the results of the DeviceIOControl function. It was returning the desired result on some PCs while it was making trouble on others.
Is there another way of achieving my goal?
I already had a look into the registry where I can also find the desired data. But again I had the problem to create the connection between device path and drive letter.
I would not like to use the WMI. I have read that it is still not really supported by MinGW.
I have an implementation of all this in C#, where it is really easy to get the desired information, but now I also need one that is written in unmanaged code and can replace a c-dll that is also included in Delphi projects.
I would appreciate any suggestions for a solution to my problem.
Best regards,
Florian
And here is the code, if someone is interested. The position with the comment "//HERE IS WHERE I WOULD LIKE TO GET THE DEVICE NUMBER!!!" is where the request for the device number would go if it worked.
typedef struct ty_TUSB_Device
{
    PSP_DEVICE_INTERFACE_DETAIL_DATA deviceDetailData;
    char devicePath[300];
} TUSB_Device;

int
GetUSBDevices (TUSB_Device *devList[], int size)
{
    HANDLE hHCDev;
    HDEVINFO deviceInfo;
    SP_DEVICE_INTERFACE_DATA deviceInfoData;
    ULONG index;
    ULONG requiredLength;
    int devCount = 0;

    // Enumerate the present devices that expose the USB device interface
    deviceInfo = SetupDiGetClassDevs((LPGUID)&GUID_DEVINTERFACE_USB_DEVICE,
                                     NULL,
                                     NULL,
                                     (DIGCF_PRESENT | DIGCF_DEVICEINTERFACE));
    if (deviceInfo != INVALID_HANDLE_VALUE)
    {
        deviceInfoData.cbSize = sizeof(SP_DEVICE_INTERFACE_DATA);
        for (index = 0;
             devCount < size && // never write past the caller's array
             SetupDiEnumDeviceInterfaces(deviceInfo,
                                         0,
                                         (LPGUID)&GUID_DEVINTERFACE_USB_DEVICE,
                                         index,
                                         &deviceInfoData);
             index++)
        {
            // First call: query the buffer size needed for the detail data
            SetupDiGetDeviceInterfaceDetail(deviceInfo,
                                            &deviceInfoData,
                                            NULL,
                                            0,
                                            &requiredLength,
                                            NULL);
            // Allocate memory for the TUSB_Device structure and its detail data
            devList[devCount] = malloc(sizeof(TUSB_Device));
            devList[devCount]->deviceDetailData = GlobalAlloc(GPTR, requiredLength);
            devList[devCount]->deviceDetailData->cbSize = sizeof(SP_DEVICE_INTERFACE_DETAIL_DATA);
            // Second call: fetch the device path
            SetupDiGetDeviceInterfaceDetail(deviceInfo,
                                            &deviceInfoData,
                                            devList[devCount]->deviceDetailData,
                                            requiredLength,
                                            &requiredLength,
                                            NULL);
            // Open the USB device
            hHCDev = CreateFile(devList[devCount]->deviceDetailData->DevicePath,
                                GENERIC_WRITE,
                                FILE_SHARE_WRITE,
                                NULL,
                                OPEN_EXISTING,
                                0,
                                NULL);
            // If the handle is valid, then we've successfully found a USB device
            if (hHCDev != INVALID_HANDLE_VALUE)
            {
                strncpy(devList[devCount]->devicePath, devList[devCount]->deviceDetailData->DevicePath, sizeof(devList[devCount]->devicePath) - 1);
                devList[devCount]->devicePath[sizeof(devList[devCount]->devicePath) - 1] = '\0';
                //HERE IS WHERE I WOULD LIKE TO GET THE DEVICE NUMBER!!!
                CloseHandle(hHCDev);
                devCount++;
            }
            else
            {
                // The entry is not kept: release it so the next iteration does not leak it
                GlobalFree(devList[devCount]->deviceDetailData);
                free(devList[devCount]);
            }
        }
        SetupDiDestroyDeviceInfoList(deviceInfo);
    }
    return devCount;
}
I found out what my problem was. From what I read on the internet it seems there were other people having the same problem as me, so I will post my solution.
The whole point is that there are obviously different path values one can obtain for a USB device using the SetupApi. All path values can be used to get a handle to that device, but there are obviously differences about what can be done with the handle.
My failure was to use GUID_DEVINTERFACE_USB_DEVICE to list the devices. I found out that when I use GUID_DEVINTERFACE_DISK, I get a different path value that lets me request the device number. That way I am able to get the link to the drive letter.
That path value obtained with GUID_DEVINTERFACE_DISK also contains the serial number but not the vendor and product IDs. But since both path values do contain the serial, it is no problem to get them both and build the relation.
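Building that relation could be sketched as follows, assuming only what is stated above: the serial number extracted from the USB device path also appears somewhere inside the matching disk path. The function names are hypothetical, the sketch is portable C++, and collecting the disk paths (via the disk interface GUID) is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Case-insensitive substring test; device interface paths are not case sensitive.
bool containsNoCase(const std::string& haystack, const std::string& needle) {
    if (needle.empty()) return false; // an empty serial must not match anything
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end(),
                          [](unsigned char a, unsigned char b) {
                              return std::tolower(a) == std::tolower(b);
                          });
    return it != haystack.end();
}

// Hypothetical helper: index of the first disk path that carries the serial,
// or -1 if none does.
int findDiskPathBySerial(const std::vector<std::string>& diskPaths,
                         const std::string& serial) {
    for (std::size_t i = 0; i < diskPaths.size(); ++i) {
        if (containsNoCase(diskPaths[i], serial)) return static_cast<int>(i);
    }
    return -1;
}
```

The disk path found this way is the one to open with CreateFile and query for the device number. A plain substring match is a simplification: very short serial numbers could match unrelated paths, so a stricter comparison of the serial field may be preferable in practice.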
I tested the code on Windows XP, 7 and 8 and it works fine. Only the CreateFile call in the code sample above must be adjusted (replace GENERIC_WRITE with 0). Otherwise Administrator rights or compatibility mode are required.
I did not try to find out what these different GUID values really stand for. Someone with a deeper knowledge in this area could probably provide a better explanation.
Best regards,
Florian

JPA with HIBERNATE insert very slow

I am trying to insert some data into SQL Server 2008 R2 using JPA and Hibernate. Everything "works" except that it's very slow. Inserting 20000 rows takes about 45 seconds, while a C# script takes less than 1 second.
Can any veteran in this domain offer some help? I would appreciate it a lot.
Update: I got some great advice from the answers below, but it still doesn't work as expected. The speed is the same.
Here is the updated persistence.xml:
<persistence version="2.0"
xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
<persistence-unit name="ClusterPersist"
transaction-type="RESOURCE_LOCAL">
<provider>org.hibernate.ejb.HibernatePersistence</provider>
<class>cluster.data.persist.sqlserver.EventResult</class>
<exclude-unlisted-classes>true</exclude-unlisted-classes>
<properties>
<property name="javax.persistence.jdbc.url"
value="jdbc:sqlserver://MYSERVER:1433;databaseName=MYTABLE" />
<property name="javax.persistence.jdbc.user" value="USER" />
<property name="javax.persistence.jdbc.password" value="PASSWORD" />
<property name="javax.persistence.jdbc.driver"
value="com.microsoft.sqlserver.jdbc.SQLServerDriver" />
<property name="hibernate.show_sql" value="false" />
<property name="hibernate.hbm2ddl.auto" value="update" />
<property name="hibernate.connection.provider_class"
value="org.hibernate.service.jdbc.connections.internal.C3P0ConnectionProvider" />
<property name="hibernate.c3p0.max_size" value="100" />
<property name="hibernate.c3p0.min_size" value="0" />
<property name="hibernate.c3p0.acquire_increment" value="1" />
<property name="hibernate.c3p0.idle_test_period" value="300" />
<property name="hibernate.c3p0.max_statements" value="0" />
<property name="hibernate.c3p0.timeout" value="100" />
<property name="hibernate.jdbc.batch_size" value="50" />
<property name="hibernate.cache.use_second_level_cache" value="false" />
</properties>
</persistence-unit>
</persistence>
And here is the updated code part:
public static void writeToDB(String filePath) throws IOException {
    EntityManager entityManager = entityManagerFactory.createEntityManager();
    Session session = (Session) entityManager.getDelegate();
    Transaction tx = session.beginTransaction();
    int i = 0;
    URL filePathUrl = null;
    try {
        filePathUrl = new URL(filePath);
    } catch (MalformedURLException e) {
        filePathUrl = (new File(filePath)).toURI().toURL();
    }
    String line = null;
    BufferedReader stream = null;
    try {
        InputStream in = filePathUrl.openStream();
        stream = new BufferedReader(new InputStreamReader(in));
        // Read each line in the file
        while ((line = stream.readLine()) != null) {
            String[] splitted = line.split(",");
            int num1 = Integer.valueOf(splitted[1]);
            float num2 = Float.valueOf(splitted[6]).intValue();
            // One new entity per line; reusing a single instance would not create a row per line
            MyRow myRow = new MyRow();
            myRow.setNum1(num1);
            myRow.setNum2(num2);
            session.save(myRow);
            if (i % 50 == 0) {
                session.flush();
                session.clear();
            }
            i++;
        }
        tx.commit();
    } finally {
        if (stream != null)
            stream.close();
    }
    session.close();
}
Updated, here is the source for MyRow:
@Entity
@Table(name="MYTABLE")
public class MyRow {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private Long id;

    @Basic
    @Column(name = "Num1")
    private int Num1;

    @Basic
    @Column(name = "Num2")
    private float Num2;

    public Long getId() {
        return id;
    }
    public void setId(Long id) {
        this.id = id;
    }
    public int getNum1() {
        return Num1;
    }
    public void setNum1(int num1) {
        Num1 = num1;
    }
    public float getNum2() {
        return Num2;
    }
    public void setNum2(float num2) {
        Num2 = num2;
    }
}
The problem
One of the major performance hits if you use Hibernate as your ORM is the way its "dirty check" is implemented (because without Byte Code Enhancement, which is standard in all JDO based ORMs and some others, dirty checking will always be an inefficient hack).
When flushing, a dirty check needs to be carried out on every object in the session to see if it is "dirty" i.e. one of its attributes has changed since it was loaded from the database. For all "dirty" (changed) objects Hibernate has to generate SQL updates to update the records that represent the dirty objects.
The Hibernate dirty check is notoriously slow on anything but a small number of objects because it needs to perform a "field by field" comparison between objects in memory with a snapshot taken when the object was first loaded from the database. The more objects, say, a HTTP request loads to display a page, then the more dirty checks will be required when commit is called.
Technical details of Hibernate's dirty checking mechanism
You can read more about Hibernate's dirty check mechanism implemented as a "field by field" comparison here:
How does Hibernate detect dirty state of an entity object?
How the problem is solved in other ORMs
A much more efficient mechanism used by some other ORMs is an automatically generated "dirty flag" attribute instead of the "field by field" comparison, but this has traditionally only been available in ORMs (typically JDO based ORMs) that use and promote byte code enhancement, or byte code 'weaving' as it is sometimes called, e.g. http://datanucleus.org and others.
During byte code enhancement, by DataNucleus or any of the other ORMs supporting this feature, each entity class is enhanced to:
add an implicit dirty flag attribute
add the code to each of the setter methods in the class to automatically set the dirty flag when called
Then during a flush, only the dirty flag needs to be checked instead of performing a field by field comparison - which, as you can imagine, is orders of magnitude faster.
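The difference between the two checks is not tied to Java, so here is a minimal C++ sketch of the idea. It is not code from Hibernate or DataNucleus: PlainRow, EnhancedRow and their members are hypothetical names, and the setters that a byte code enhancer would generate are written out by hand.

```cpp
#include <cassert>
#include <string>

// Plain entity: detecting a change means comparing every field with a
// snapshot that was copied when the object was loaded.
struct PlainRow {
    int num1 = 0;
    std::string name;
};

bool isDirtyByComparison(const PlainRow& current, const PlainRow& snapshot) {
    return current.num1 != snapshot.num1 || current.name != snapshot.name;
}

// "Enhanced" entity: every setter raises the flag, so the check at flush
// time reads a single bool and no snapshot has to be kept.
class EnhancedRow {
public:
    void setNum1(int v) { num1_ = v; dirty_ = true; }
    void setName(const std::string& v) { name_ = v; dirty_ = true; }
    int num1() const { return num1_; }
    const std::string& name() const { return name_; }
    bool isDirty() const { return dirty_; }
    void markClean() { dirty_ = false; } // to be called after a flush
private:
    int num1_ = 0;
    std::string name_;
    bool dirty_ = false;
};
```

The comparison variant costs one snapshot copy per loaded object plus a per-field comparison at every flush. The flag variant costs one extra bool per object and one assignment per setter call.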
Other negative consequences of "field by field" dirty checking
The other inefficiency of the Hibernate dirty checking is the need to keep a snapshot of every loaded object in memory to avoid having to reload and check against the database during dirty checking.
Each object snapshot is a collection of all its fields.
In addition to the performance hit of the Hibernate dirty checking mechanism at flush time, this mechanism also burdens your app with the extra memory consumption and CPU usage associated with instantiating and initializing these snapshots of every single object that is loaded from the database - which can run into the thousands or millions depending on your application.
Hibernate has introduced byte code enhancement to address this but I have worked on many ORM persisted projects (both Hibernate and non Hibernate) and I am yet to see a Hibernate persisted project that uses that feature, possibly due to a number of reasons:
Hibernate has traditionally promoted its "no requirement for byte code enhancement" as a feature when people evaluate ORM technologies
Historical reliability issues with Hibernate's byte code enhancement implementation which is possibly not as mature as ORMs that have used and promoted byte code enhancement from the start
Some people are still scared of using byte code enhancement due to the promotion of an anti 'byte code enhancement' stance and the fear certain groups instilled in people regarding the use of byte code enhancement in the early days of ORMs
These days byte code enhancement is used for many different things - not just persistence. It has almost become mainstream.
To enable JDBC batching you should initialize the property hibernate.jdbc.batch_size to between 10 and 50 (int only)
hibernate.jdbc.batch_size=50
If it's still not as fast as expected, then I'd review the document above paying attention to NOTE(s) and section 4.1. Especially the NOTE that says, "Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator."
Old topic but came across this today looking for something else. I had to post on this common problem that is unfortunately not very well understood and documented. For too long, Hibernate's documentation had only that brief note as posted above.
Starting with version 5, there is a better but still thin explanation: https://docs.jboss.org/hibernate/orm/5.3/userguide/html_single/Hibernate_User_Guide.html#identifiers-generators-identity
The problem of slow inserts of very large collections is simply a poor choice of Id generation strategy:
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
When using the Identity strategy, what needs to be understood is that the database server creates the identity of the row on the physical insert. Hibernate needs to know the assigned Id to have the object in persisted state in the session. The database-generated Id is only known from the insert's response. Hibernate has NO choice but to perform 20000 individual inserts to be able to retrieve the generated Ids. It doesn't work with batching as far as I know, not with Sybase, not with MSSQL. That is why, regardless of how hard you try and with all the batching properties properly configured, Hibernate will do individual inserts.
The only solution that I know of and have applied many times is to choose a client side Id generation strategy instead of the popular database side Identity strategy.
I often used:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE)
@GenericGenerator(strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator")
There's a bit more configuration to get it to work, but that's the essence of it. When using client side Id generation, Hibernate will set the Ids of all the 20000 objects before hitting the database. And with proper batching properties as seen in previous answers, Hibernate will do inserts in batches, as expected.
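Why client side generation removes the per-row round trip can be shown with a small model. This is a C++ sketch, not Hibernate's implementation (SequenceStyleGenerator has its own optimizers that are not reproduced here). FakeSequence and BlockIdGenerator are hypothetical names, and the block size of 50 is chosen only to mirror the batch size from the question.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a database sequence: every call to nextBlock() represents
// one round trip that reserves `blockSize` consecutive ids.
class FakeSequence {
public:
    explicit FakeSequence(std::int64_t blockSize) : blockSize_(blockSize) {
        assert(blockSize > 0);
    }
    std::int64_t nextBlock() {
        ++roundTrips_;
        const std::int64_t start = next_;
        next_ += blockSize_;
        return start;
    }
    std::int64_t blockSize() const { return blockSize_; }
    int roundTrips() const { return roundTrips_; }
private:
    std::int64_t blockSize_;
    std::int64_t next_ = 1;
    int roundTrips_ = 0;
};

// Client side generator: hands out ids from the reserved block and only
// goes back to the sequence when the block is used up.
class BlockIdGenerator {
public:
    explicit BlockIdGenerator(FakeSequence& seq) : seq_(seq) {}
    std::int64_t nextId() {
        if (remaining_ == 0) {
            current_ = seq_.nextBlock();
            remaining_ = seq_.blockSize();
        }
        --remaining_;
        return current_++;
    }
private:
    FakeSequence& seq_;
    std::int64_t current_ = 0;
    std::int64_t remaining_ = 0;
};
```

In this model, assigning ids to 20000 objects with a block size of 50 takes 400 round trips to the sequence, and every object has its id before any insert is sent, so the inserts themselves are free to be batched. With an identity column each id only becomes known from its own insert, which is the 20000 individual statements described above.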
It is unfortunate that the Identity generator is so convenient and popular; it appears everywhere in examples without a clear explanation of the consequences of using this strategy. I have read many so-called "advanced" Hibernate books and have never seen one so far that explains the consequence of Identity on the underlying insert performance on large data sets.
Hibernate "default mode" IS slow.
Its advantages are Object Relational Mapping and some cache (but obviously it is not very useful for bulk insertion).
Use batch processing instead http://docs.jboss.org/hibernate/core/4.0/devguide/en-US/html/ch04.html

Change Address/Port of WSDL EndPointAddress at runtime?

So I currently have 2 WSDLs added as Service References in my solution. They look like this in my app.config file (I removed the "bindings" field, because it's uninteresting):
<system.serviceModel>
<client>
<endpoint address="http://localhost:8080/query-service/jse" binding="basicHttpBinding" bindingConfiguration="QueryBinding" contract="QueryService.Query" name="QueryPort" />
<endpoint address="http://localhost:8080/dataimport-service/jse" binding="basicHttpBinding" bindingConfiguration="DataImportBinding" contract="DataService.DataImport" name="DataImportPort" />
</client>
</system.serviceModel>
When I utilize a WSDL, it looks something like this:
using (DataService.DataClient dClient = new DataService.DataClient())
{
    DataService.importTask impt = new DataService.importTask();
    impt.String_1 = "someData";
    DataService.importResponse imptr = dClient.importTask(impt);
}
In the "using" statement, when instantiating the DataClient object, I have 5 constructors available to me. In this scenario, I use the default constructor:
new DataService.DataClient()
which uses the built-in Endpoint Address string, which I assume is pulled from app.config. But I want the user of the application to have the option to change this value.
1) What's the best/easiest way of programmatically obtaining this string?
2) Then, once I've allowed the user to edit and test the value, where should I store it?
I'd prefer having it be stored in a place (like app.config or equivalent) so that there is no need for checking whether the value exists or not and whether I should be using an alternate constructor. (Looking to keep my code tight, ya know?)
Any ideas? Suggestions?
EDIT
Maybe I should ask about these Alternate constructors as well.
For example, one of them looks like this:
new DataService.DataClient(string endPointConfigurationName,
string remoteAddress)
What values could get passed for "endPointConfigurationName" and "remoteAddress"?
EDIT2
Answering my own questions here, the "endPointConfigurationName" appears to be the same as the "name" in the app.config XML and the "remoteAddress" is formatted the same as "endpoint address" in the app.config XML.
Also! The answer to my first question about getting the EndPointAddresses is the following:
ClientSection clSection =
    ConfigurationManager.GetSection("system.serviceModel/client") as ClientSection;
ChannelEndpointElementCollection endpointCollection =
    clSection.ElementInformation.Properties[string.Empty].Value as ChannelEndpointElementCollection;
Dictionary<string, string> nameAddressDictionary =
    new Dictionary<string, string>();
foreach (ChannelEndpointElement endpointElement in endpointCollection)
{
    nameAddressDictionary.Add(endpointElement.Name,
        endpointElement.Address.ToString());
}
EDIT3
Ok, I think I've figured out the 2nd half (and thus, full solution) to my problem. I found this on another website and I modified it to meet my needs:
Configuration configuration;
ServiceModelSectionGroup serviceModelSectionGroup;
ClientSection clientSection;
configuration =
    ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.None);
serviceModelSectionGroup =
    ServiceModelSectionGroup.GetSectionGroup(configuration);
clientSection = serviceModelSectionGroup.Client;
foreach (ChannelEndpointElement endPt in clientSection.Endpoints)
{
    MessageBox.Show(endPt.Name + " = " + endPt.Address);
}
configuration.Save();
With this code, we have access to the clientSection.Endpoints and can access and change all the member properties, like "Address". And then when we're done changing them, we can do configuration.Save() and all the values get written out to a user file.
Now here's the catch. In debug mode, configuration.Save() does not appear to persist your values from execution to execution, but when running the application normally (outside of debug mode), the values persist. (Which is good.) So that's the only caveat.
EDIT4
There is another caveat. The changes made to the WSDLs do not take effect during runtime. The application needs to be restarted to re-read the user config file values into memory (apparently.)
The only other thing that I might be interested in is finding a way (once the values have been changed) to revert the values to their defaults. Sure, you can probably delete the user file, but that deletes all of the custom settings.
Any ideas?
EDIT5
I'm thinking Dependency Injection might be perfect here, but need to research it more...
EDIT 6
I don't have comment privileges but you need to run
ConfigurationManager.RefreshSection("client");
to have the cache updated so that changes happen immediately.
If you're using Microsoft Add Web Reference to create your service reference, then I think you may have trouble changing the connection programmatically. Even if you did change the auto generated code, as soon as you did an Update Service Reference it'd be overwritten.
Your best bet is to scrap Microsoft's auto generated code and build your own WCF classes. It's not difficult, and offers lots of flexibility / scalability.
Here's an excellent article on this very subject.
As for storing the custom addresses, it would depend on your app whether it's a Silverlight, Windows or web app. My personal choice is the database.