IBM WebSphere / Connections – Performance, Security and the SESN0008E Error


Just found a new issue today that has been vexing me for quite some time, I only found it because of the error [SESN0008E] and the fact that I had to add a whole new WebSphere Nodes to an existing environment so that these errors finally happend with a frequency that I finally noticed it:

logServletError SRVE0293E: [Servlet Error]-[atom-basic]: com.ibm.websphere.servlet.session.UnauthorizedSessionRequestException: SESN0008E: A user authenticated as anonymous has attempted to access a session owned by user:defaultWIMFileBasedRealm/CN=Joe Shmoe,OU=MYOU,dc=corp,dc=company,dc=com

We ran function testing on the new Node and performance was horrendous … I mean really horrendous. Performance has been bad overall before that, but at least tings worked. I investigated a few tech notes, there was some mentioning about the LTPA timeout being too short in combination with several other settings. This is an upgrade, same settings as other systems, should not be an issue … so I looked at other sites/pages here, here and here.

All of the tech notes mentioned [Security Integration] being at the heart of it. I checked all servers and noticed – none of the server that the Connections installer created had this settings set, all servers that I created manually had it. I looked into this a bit more and found out that the [Session Management] – [Security integration] is now a default setting for WebSphere and if you create a server manually it is automatically set. I ran a few third party products in separate servers that were all manually created … they probably brought the overall performance down.

So, I went through all servers that I had created, unset the settings (pic below) and then synced and restarted everything and …. voila, speed restored.

sessionManagement

IBM Connections, Exchange, Kerberos and the Tale of External Non-Collaboration


It is a longer tale, so to make keep it short I decided to busy the lead and give you the synopsis right here:

If you are running IBM Connections integrated with Exchange as your ICMail setup you are using Kerberos. If you want to enable external collaboration by adding another LDAP source for your external users – it will not work.

You can create the repository, add it to WebSphere, you can do all the TDI settings to import the users in it as external users .. but they will not be able to authenticate. The reason is that WebSphere has the authentication mechanism at it’s top level of security (global) and not at the repository level. That means, once you use Kerberos you have to use Kerberos for ALL authentication that happens. Trust me, I have tested. I had PMRs open (with both Connections and WebSphere support). I talked to the IBM Connections Product team and verified that this specific scenario was never actually tested so nobody appears to have known of this, which is also why it never made it’s way into any documentation.

I don’t think there are many clients for whom this might be an issue currently, but I do see many environments wanting more security and wanting to tie in other back-end systems and if that client environment is running AD as their LDAP source , then KERBEROS will be right there as a feature request – or a necessity.

Is External Collaboration Dead when Using Kerberos?

That is an easy answer – No.

But you are now forced to add those external users to your AD forest and either add them to some branch/OU that you can treat as external users or add some AD/LDAP attribute to identify them as external users.

Feature Enhancement Request for WebSphere – PLEASE VOTE!

I entered a feature enhancement request to move the authentication method from a global setting to the repository level – either in general or as art of a security domain setup in WebSphere, thereby allowing non-Kerberos repositories to be used for authentication alongside a KERBEROS enabled repository.

Here is the link to the feature request – the more people look at it, follow it and vote for it the more likely it is to make it’s wat into a future release. you will need to have an IBM website ID to even just look at it but I’d appreciate the effort!

Connections 5.5 – Install Problem for WebSphere Cluster Settings with UNC Shares


I just installed a new Connections V5.5 environment for a new client and came across this issue that I had encountered once before in previous versions when installing the IBM File viewer (look at my presentation from last year at MWLug 2015) .

Scenareo:

  • Connections 5.5,
  • Clustered Windows WebSphere servers (2 nodes on separate Windows server)
  • Windows File Share for shared file services (accessed using a UNC link i.e.: \\[fqhn of server]\[share name])

The Installer will go through and work without a problem, all apps are installed and the clusters in WebSphere created. When you run the WebSphere servers/JVMs for the first time you might notice a new folder created on the same drive as your WebSphere install, the name follows the above UNC naming for the shared file services location. In my case the folder created was [D:\FILESERVER\CnxData\messagestores\xxx).

Messagestores are the way that messaging engines running on WebSphere clustered servers communicate with each other by reading/writing log files (there is much more to it, but let’s keep this lite here …). Both Windows server will create the same folders and you will probably not see a whole lot of errors in the systemout.log files of the WebSphere servers because … those servers can access the files they expect, that they are not getting any inputs from other cluster members is not going to raise any errors inside of WebSphere.

In V5.0 what happens is that the installer creates a WebSphere variable and uses that variable in the cluster settings and then the system works and the UNC drive is read correctly. The V5.5 installer does not do this, it writes the location directly into the sib-engines.xml file of the cluster created and then things fall apart ….

 

What to do:

Basically you have to manually do what the installer should have done:

Create a WebSphere variable

  • I created the same one as V5.0 would have [MESSAGE_STORE_PATH] and gave it the value of the UNC folder location in WINDOWS format (using “\” slashes): i.e. [\\servername\share\messagestores]

Update the sib-engines.xml

  • Search for the sib-engine.cml files  on the Dmgr profile under: ..\WebSphere\AppServer\profiles\Dmgr01\config\cells\[cell name]\clusters\[Cluster Name]
  • Edit the last line in the file for each cluster to look something like this:
<fileStore xmi:id="SIBFilestore_1456105865384" uuid="5976E93BC88E6CB1" logSize="100" minPermanentStoreSize="200" maxPermanentStoreSize="500" minTemporaryStoreSize="200" maxTemporaryStoreSize="500" logDirectory="${MESSAGE_STORE_PATH}/UtilCluster/log" permanentStoreDirectory="${MESSAGE_STORE_PATH}/UtilCluster/store" temporaryStoreDirectory="${MESSAGE_STORE_PATH}/UtilCluster/store"/>

Note the use of “/” in this entry, do it that way!

Do the WAS Thing:

  • You need to then sync the nodes and restart all servers/clusters and then WebSphere will create the folders and subfolders is needs and all will be well ….

 

After a restart you can delete the incorrectly created folders, they do not contain any data you need, the data written into there is transactions and will be re-created when the servers restart.

IBM Connections with Exchange Back-end – Chrome and Kerberos Delegation


First of all, thanks to my new found friend Michele Buccarello who had shared this document earlier last month on some very good pointers about how to integrate Exchange with IBM Connections.  With that document and some guesswork as to encryption settings between WAS and Exchange I was able to solve the problem – 90% of the way. We got it to work with IE and FireFox but Chrome was balking and getting into a log-out cycle. I used Fireshark to take a look and noticed it was an auth.redirect action by the HOMEPAGE app that was followed by a rest API call to Opensocial calendar settings .for my acocunt – and then righ back to the auth.redirect …. a classic redirect loop.
As things were working in FF and IE I knew it was not a system issue but rather a problem localized to Chrome so I looked up some technotes and knowledge base articles and here is how I solved it:
Chrome can be taught to work with Kerberos delegation just as IE and FF. For “normal” SPNEGO it takes it’s settings from IE and will accept them but with Exchange there is delegation going on (if you look at the Connections documentation it has you change two settings for both IE and FF, one of them refers to delegation) and Chrome needs to get a whitelist of which website it accepts delegation tickets from:
Option 1: Command line
Change the command line that starts Chrome to include a command switch:
chrome.exe –auth-negotiate-delegate-whitelist=*
Set the value to either [*] (make sure there are NO QUOTES surrounding the [*] as some documentation in various articles will have you enter it as) or any combination of the actual url you are connecting to i.e.: [*.domain.com] to limit it to anything inside the intranet domain or [connections.domain.com] for only the Connections website itself. Apparently this can also be a comma separated list of entries if that works for you.
Option 2: Create Windows Registry entry
Create this entry: [HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Google\Chrome]
In it create a string entry: [AuthNegotiateDelegateWhitelist]
Any of the values used in the above command line example will work in this registry entry so I suggest to try it above first.
Enjoy – you’re welcome!

SPNEGO: Map SPNs and Create Combined Keytab Files In One Step


I have been wanting to blog about my SPNEGO install guide for a while but have been just a bit busy lately (my usual excuse). However, I just had to help a client setup SPNEGO for their IBM Connections environment so I decided the time for procrastination is over.

 

If you look at the IBM documentation, the process to create the SPNEGO keytab files and mapping the correct URLs and Fully Qualified Hostnames of servers to the AD account is rather onerous. IBM documentation will have you create separate keytab files for each url/FQHN that you want to include in the SPNEGO config and then merge them. For the normal user that is setting up SPNEGO for the fist time that is painful indeed and confusing. My process below does it all in one step (one step per URL/fqhn) and adds all the settings to ONE keytab file. I am usually done in 5 minutes and then create the config file using wsadmin commands and am up and running in SPNEGO in under an hour.

Note: all commands below have to happen ON AN AD DOMAIN CONTROLLER, running them on your workstation will not work.

 

Environment / Variables:

  • SPNEGOAD account: SPNEGOAccount@DOMAIN.COM – domain\SPNEGOAccount
  • Server FQHN: serverfqhn1.example.com, serverfqhn2.example.com, serverfqhn3.example.com, etc.
  • Connections URL (c-record): connections.example.com



Check Current SPN mappings for SPNEGO AD Account:

  • setspn -l SPNEGOAccount
    (review output)


Step 2: Add SPN mapping to SPNEGOAccount
 and create Keytab files

[setspn -s] or [setspn -a] could be used just to add/map the SPNs to the account, but this does not create the keytab files.

  • setspn -s HTTP/servernew.example.com SPNEGOAccount
  • setspn -s HTTP/newsite.example.com SPNEGOAccount

 

Run commands to create a SINGLE keytab file AND map accounts at the same time:

  • ktpass -princ HTTP/servernew.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser SPNEGOAccount -mapOp set -pass password1A -in C:\Temp\KRB\krb5.keytab-out C:\Temp\KRB\krb5.keytab
  • ktpass -princ HTTP/newsite.example.com@example.com -ptype KRB5_NT_PRINCIPAL -mapUser SPNEGOAccount -mapOp add -pass password1A -in C:\Temp\KRB\krb5.keytab -out C:\Temp\KRB\krb5.keytab

 

Note: the first command has the command [set], all the following commands (one for each url/fqhn you want to add) has the command [add]. If you do not use the [add] command, each of your subsequent commands will override your previous one, leaving your AD account with only one fqhn/URL mapped to it. THIS IS IMPORTANT!
Check whether the SPNS are all correct:

  • setspn -l SPNEGOAccount
    (get output and show it has mappings)
  • ldifde -f c:\temp\new-output1.txt -r “(servicePrincipalName=HTTP/ serverfqhn1.example.com)”
  • ldifde -f c:\temp\new-output2.txt -r “(servicePrincipalName=HTTP/connections.example.com)”
    (Get output files and review)

 

 

Some Gotchas

Which  URLs/c-records and server FQHNs to map:

I map EVERYTHING. The main reason is that often your C-record for the site (our example connections.example.com) will point to the fqhn of a server or a load balancing device. In that case you need BOTH of them mapped. I mal all webservers/HIS, WAS servers and (if existing) the LB address (this s usually overkill and not necessary … but paranoia pays off sometimes).

Command errors:

Depending you your AD forest, the above ktpass command might need the AD account your are mapping to either in the [ACCOUNTNAME@DOMAIN.COM] format or [DOMAIN\ACCOUNTNAME] format. You will see the error right away when you run it for the first time.

SPNEGO setting in WebSphere:

If you go by the IBM documentation (there is allot flying around) you will see they generally tell you to add the fqhn of the Deployment Manager as the HOSTNAME in SPNEGO. Keep in mind that works for them because generally they testers tend to work with single server test installs where ALL the systems run on one server and the Dmgr is also the HIS server and often they don’t bother to change the URL for the Connections setup. What you need in there is the C-Record your users will be putting into their browsers to get to Connections in in our example connections.example.com. Should the C-record point to the FQHN of a web server then you could input that address as well. That is why I generally map EVERYTHING, that way you have maximum flexibility should you need to finagle with your architecture and move functionality around.

Oops, you forgot something …

If you suddenly notice you have to add servers to the SPNEGO setup (maybe you are migrating) – DO NOT ADD MORE MAPPINGS TO THE SPNEGO AD ACCOUNT. That will invalidate the existing keytab files and you will have a n SSO outage. To add additional files you have to stop all WebSphere servers involved , add the mappings with the ktpass command using the [ADD] variable and use the existing keytab file from one of your WebSphere servers. Then recreate the config file using wsdmin and replace the old keytab files with the new one.

WebSphere – The Basics on Security, Directories and Federated Repositories


I had promised earlier this year to post more content (other than opinion and news) so I am now catching up on my promise. This post was inspired by a combined WebSphere – IBM Connections review review I did for a client earlier this year, along with some content from my IBM Connections admin training that I offer and that the same client asked me to give after they read my review of their environment. This is the first in a small series of blog-posts on security and configuration in WebSphere, look forward to some more in the next few weeks.

My Shameless Plug: You can get all of this in one big gulp if you hire me for some admin training for your support staff. I also do really kick-ass reviews of IBM Connections environments and performance tuning . . . .

WebSphere – LDAP / Security / Admin rights … the open door policy

I wrote an article on this webpage back in 2012 – WebSphere: wasadmin – how to recover a lost password – that also has something to do with this topic. This posting is in addition to that and will give you some more background info on how WebSphere keeps it’s security info and LDAP settings. If you read below you can find an even easier way to get that info …

XML – The Language of WebSphere

If you have not yet heard about it, here is the story: just about everything (regarding settings and configuration)in WebSphere is XML based. Yes, there are properties files and basic text files but the most files you will be dealing with are all XML files.

This results in Dr Vic’s first two rules:

Rule#1 – Always use a REAL XML editor program – and notepad.exe or wordpad.exe do not count. I personally have two favorites: Notepad++ on Windows and Geany on Linux (or Bluefish Editor – also awesome).

Rule#2 – Never putz (this is a technical term, I swear) in WebSphere XML files without having a back-up of each and every version of your change. If it gets really bad, you will have to re-install WebSphere and loose allot of work.

Shameless plug: I have more rules … hire me to learn more.

Federated Repository

The majority of my clients set up their LDAP settings in WebSphere by going to [Security – Global Security – Federated Repositories] and then never look at it again after that. They don’t really understand what the back-end is – well, here is a crash course:

Federated – The definition:

From late Latin foederatus, based on foedus, foeder- ‘league, covenant.’

Adj. 1. federated – united under a central government. Federate / united – characterized by unity; being or joined into a single entity; “presented a united front”

OK, what does this mean? When you installed WebSphere you were asked about an admin account and a password to assign to it – by default that account is called [wasadmin] though you can change it to anything you want. That user name and password is saved in a FILE BASED directory structure in the Deployment manager and replicated out to all federated nodes. When you add an LDAP directory then the Files based (the thing you see defined as [defaultWIMFilesBasedRealm] are federated meaning that now they are BOTH together part of a SINGLE directory entity that all WebSphere applications will utilize as a single unit for the purpose of user account look-ups and authentication.

The Files Involved:

wimconfig.xml

Is located in the [deployment manage profile]\config\cells\[cellname]\wim\config folder. This file contains the federated directory setting definitions. So the files based directory (more details below) and the LDAP directory/directories are all defined and configured in this file. As this file is an XML file, each directory is defined inside the <config:repositories and the </comfig:repositoes> items.

Let’s look at the example from my training WebSphere environment:

<config:repositories xsi:type=”config:FileRepositoryType” adapterClassName=”com.ibm.ws.wim.adapter.file.was.FileAdapter” id=”InternalFileRepository” supportPaging=”false” messageDigestAlgorithm=”SHA-1″>

<config:baseEntries name=”o=defaultWIMFileBasedRealm”/>

</config:repositories>

<config:repositories xsi:type=”config:LdapRepositoryType” adapterClassName=”com.ibm.ws.wim.adapter.ldap.LdapAdapter” id=”TTrainDom01″ isExtIdUnique=”true” supportAsyncMode=”false” supportExternalName=”false” supportPaging=”false” supportSorting=”false” supportTransactions=”false” supportChangeLog=”none”certificateFilter=”” certificateMapMode=”exactdn” ldapServerType=”DOMINO” translateRDN=”false”>

<config:baseEntries name=””/>

<config:loginProperties>uid</config:loginProperties>

<config:loginProperties>mail</config:loginProperties>

<config:loginProperties>cn</config:loginProperties>

<config:ldapServerConfiguration primaryServerQueryTimeInterval=”15″ returnToPrimaryServer=”true” sslConfiguration=””>

<config:ldapServers authentication=”simple” bindDN=”ldapaccess” bindPassword=”{xor}Dz4sLCgwLTtubWx+”

connectionPool=”false” connectTimeout=”20″ derefAliases=”always” referal=”ignore” sslEnabled=”false”>

<config:connections host=”ldap.intranet.toalsys.com” port=”389″/>

</config:ldapServers>

</config:ldapServerConfiguration>

This shows the two entries I have in my environment:

  • The default file based repository identified by the ID <id=”InternalFileRepository”>
  • My Domino based LDAP repository identified by the ID <id=”TTrainDom01″>

Gotcha #1: User Name and Password is Open

This wimconfig.xml contains the user name and encoded password for the LDAP bind account. Note the choice of words … ENCODED, not ENCRYPTED.

If you want to know the password for my training LDAP account copy the encoded password above and go to this link by Andrew Jones: http://www.poweredbywebsphere.com/decoder.html (thanks Andrew, I send all my clients to your site for further info and learning!)

If his is a production environment I have now gained access to an account in your environment, possibly an account that has update/write rights to the LDAP directory ….. all by looking at one file. If you are like 99.9% of my clients you are compromised:

  1. SECURE YOUR SERVER, LOCK DOWN YOUR FILE SYSTEM
  2. DON’T USE ADMIN OR PERSONAL ACCOUNTS TO BIND TO LDAP
  3. DON’T RE-USE THE SAME PASSWORD FOR ALL YOUR ACCOUNTS

Sound obvious, doesn’t it?

 Gotcha #2: Rogue LDAP entries

If you have ever tried to change an LDAP directory, replace and entry in WebSphere you might have run into the issue that you suddenly can’t log into WebSphere anymore after you made the changes. Why? Well, you need to understand that sometimes when you make changes, those old entries don’t disappear totally – they are left behind and impact you.

Remember the part about FEDERATED above? If not ALL directory entries here (in this file, not what shows in the IBM Console) are accessible and functioning, then the federated directory that you are trying to access will not work and you cannot authenticate. It is the Three Musketeer principle: “All for One, One for All”

Gotcha #3:

Some changes can’t be made in the interface. I had a client that mistakenly entered an LDAP directory as Microsoft AD but it was Domino. They tries to clean it up in this file but it still was not working and they could not log in ….. well, the wimconfig.xml contains allot of directory type specific settings which are set by the type: <ldapServerType=”DOMINO” > .. My advice is to remove the incorrect entry and enter a NEW entry at the same time and then make sure the old incorrect one is gone from the wimconfig.xml. DO NOT manually try to clean this up (other than remove the entry) as you might end up destroying the wimconfig.xml and making your environment unusable.

Remember Dr. Vic’s rule #2 above? Make back-ups before any changes to WebSphere security settings.

IBM Connections: Metrics / Cognos and the HTTP timeout . . . .


Let me preface this post with a statement:

I really, really don’t like Cognos. Metrics is a pain as well.
And .. in case it lacked sufficient emphasis … I really, R E A L L Y don’t like  Cognos.

 

This is an interesting one that I have been battling with for quite some time at my current client. We had been running into errors with Cognos reports not finishing and the only errors we saw were in the syemOut.log files for HTTP sessions suddenly being reset:

 [3/5/13 11:51:05:341 EST] 000000b8 CognosBIReque 3 com.ibm.connections.metrics.reportgeneration.cognos.CognosBIRequestProcessor processCognosBIRequest post jobTemplateSearchPath=/content/folder[@name=’IBMConnectionsMetrics’]/package[@name=’Metrics’]/folder[@name=’static’]/jobDefinition[@name=’jobtemplate5′]

[3/5/13 11:56:05:532 EST] 000000b8 SystemErr R java.net.SocketException: Connection reset

Metrics sends Cognos 5 HTTP requests for each report time range – these correspond to the Jobtemplate1 – Jobtemplate5 reports in Cognos that are called and executed. These HTTP requests are synch calls so they have to stay connected and wait until the Jobtemplate call is finished so metrics can update the process. for all successful calls yo will see HTTP status 200 results and that is exactly what you want. We were seeing the above resets for calls to the Jobtemplate4 and Jobtemplate5 calls – it was KILLING ME.

Metrics was not at fault – it has it’s timeout settings in the metrics-config.xml file (secsPerRequest) and that was set to 3600 so it was off the list of culprits.

We reset the HTTP servers plug-in.xml setting for timeouts (ServerIOTimeout) first to 400 seconds and then to 600 and we saw no change.

We then did a test – we changed the interService href in the LotusConenctions-config.xml file as follows – btw that only works because we have a single Cognos server, not a clustered pair:

sloc:serviceReference bootstrapHost=”” bootstrapPort=”” clusterName=”admin_replace” enabled=”true” serviceName=”cognos” ssl_enabled=”true”> 
<sloc:href>
 
<sloc:hrefPathPrefix>/cognos</sloc:hrefPathPrefix>
 
<sloc:static href=”
http://connect.domain.com” ssl_href=”https://connect.domain.com“/> 
      <sloc:interService href=”https://cognosserverFQHN.domain.com:9443“/> 

Drum-roll ….. Here we go, it fixed the issue, but now the progress display (“xxx% complete”) on the metrics page to be permanently stuck at 0%. What this did do was point ut to the problem …. the F5 load balancer that we in front of the dual HTTP servers. It had a permanent 5 minute http thread timeout set and was killing ANY thread that was going over 5 minutes.

 

The Takeaway:

Metrics/Cognos spawns exactly 110 jobs for each Community metrics update request, many of these requests will go over 5 minutes and you should check that any device/server in your network has a higher HTTP timeout seting.