Article Original Creation Date: 2010-12-09
Overview
In production, a large number of invalid inform messages are observed, such as the following:
- Informs that have a partial SOAP message
- Informs that do not have a SOAP message and contain only an HTTP header
- Informs that contain invalid ASCII characters
The rate is fairly stable at around 1000 invalid informs per hour. One ACS server file spanning 15 minutes, with a higher-than-average number of cases, shows the following distribution:
- Incomplete informs: BBOX1=159, BBOX2=9, other (old Philips modems)=12
- Illegal char informs: other=60
- Empty SOAP message: unknown=27
Some of these modems are not known in Service Gateway because their inform is refused. By type of invalid inform:
- Incomplete: the modem will most probably come back with a full inform, which will then be logged.
- Empty: the modem will most probably come back with a full inform, which will then be logged.
- Illegal: the modem has most probably never been logged in SGW, because the illegal characters are stored in its local database.
Questions:
- How can we create reports on these cases, with a unique ID and timestamp?
- Why might the inform messages be truncated (after the HTTP header or partway through the message)?
- Is it possible to strip illegal characters before the inform is processed? (as a patch on 4.1)
Environment
Service Gateway 4.0.12.0
Solaris 10
Weblogic 9.2 MP1
Oracle 10g
Root Cause
The client discovered that messages were being truncated by the load balancer in front of the ACS when the load balancer reached the maximum number of allowable simultaneous sessions.
The load balancer would establish a TCP/IP session with the device and only then check whether the maximum number of allowable connections had been reached.
If it had, the load balancer would immediately reset the TCP/IP session. In the meantime, the modem had already started to send its inform across the wire, and the reset truncated it.
The ACS was not able to parse the partial message, which resulted in the TR-069 stack of Sagem modems crashing.
Resolution
How can we create reports on these cases, with a unique ID and timestamp?
- Inform-based session data is typically stored in the SPRT_NC_CWMP_SESSION table, but not in this case, because the ACS servers cannot parse and process the inform message correctly. For informs that are complete and processed correctly, the Service Gateway reporting engine can be used to generate reports from data in this table.
- A script could be written to parse the log data on the ACS servers and provide some detail around these partial informs.
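As a rough illustration of such a script, the sketch below sorts raw inform requests into the three categories described in the Overview. It assumes that each raw HTTP request (headers plus body) has already been recovered from the ACS log as a string; the log format is not described in this article, so that extraction step is left out. The names classify_inform, extract_serial and summarize are hypothetical, and the serial number lookup assumes the TR-069 DeviceId/SerialNumber element was received before the message was cut off.

```python
import re
from collections import Counter

# Control characters that XML 1.0 does not allow (tab, LF and CR are legal).
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")
# Closing SOAP Envelope tag, with or without a namespace prefix.
ENVELOPE_CLOSE = re.compile(r"</(?:[\w.-]+:)?Envelope\s*>")
# SerialNumber element of the TR-069 DeviceId structure, if fully received.
SERIAL_NUMBER = re.compile(r"<SerialNumber>([^<]*)</SerialNumber>")


def classify_inform(raw: str) -> str:
    """Classify one raw HTTP request as empty, illegal_char, incomplete or ok."""
    # HTTP separates the headers from the body with a blank line.
    normalized = raw.replace("\r\n", "\n")
    _, _, body = normalized.partition("\n\n")
    if not body.strip():
        return "empty"          # HTTP header only, no SOAP message
    if ILLEGAL_XML_CHARS.search(body):
        return "illegal_char"   # checked first, so it wins over "incomplete"
    if not ENVELOPE_CLOSE.search(body):
        return "incomplete"     # SOAP envelope was never closed
    return "ok"


def extract_serial(raw: str):
    """Return the device serial number if it was received, else None."""
    match = SERIAL_NUMBER.search(raw)
    return match.group(1) if match else None


def summarize(raw_messages):
    """Count how many messages fall into each category."""
    return Counter(classify_inform(m) for m in raw_messages)
```

Calling summarize on the requests taken from one log file gives a per-category count comparable to the 15-minute distribution in the Overview. For a report with an ID and timestamp, extract_serial can supply the ID when the serial number arrived intact, while the timestamp would have to come from the log line itself.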
Why might the inform messages be truncated (after the HTTP header or partway through the message)?
- As indicated, the load balancer was allowing the establishment of TCP sessions with the device and then in some cases tearing down the session when it determined that the maximum concurrent session limit had been exceeded. This resulted in the truncated inform messages.
Is it possible to strip illegal characters before the inform is processed? (as a patch on 4.1)
- This functionality was added in Service Gateway 4.1.4 and obsoleted in Service Gateway 4.3 with the addition of the "Event Pre-processing" functionality. Please refer to the respective release notes for additional information.
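For illustration only, the general idea of stripping illegal characters before parsing can be sketched as follows. This is not the Service Gateway 4.1.4 implementation or the "Event Pre-processing" feature, neither of which is described in this article. It assumes that "illegal" means characters outside the range XML 1.0 permits, and the function name strip_invalid_xml_chars is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Anything outside the XML 1.0 "Char" production: tab, LF, CR,
# U+0020..U+D7FF, U+E000..U+FFFD and U+10000..U+10FFFF are allowed.
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000a\u000d\u0020-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove every character that an XML 1.0 parser would reject."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    # A made-up fragment with two control characters inside a value.
    dirty = "<Inform><SerialNumber>AB\x00C\x1f123</SerialNumber></Inform>"
    clean = strip_invalid_xml_chars(dirty)
    # The cleaned text parses; the original would raise a parse error.
    print(ET.fromstring(clean).findtext("SerialNumber"))  # prints ABC123
```

Removing characters silently changes the stored value, so a real pre-processing step would probably also log which device sent them.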