Discussion:
How to set H.264 and AAC live frame timestamps?
g***@vgaic.com
2014-09-16 05:54:27 UTC
Dear Sir,

How do I set the timestamps for live H.264 and AAC frames?
I use live555 to run an RTSP server for my live H.264/AAC stream.
First, I read each frame's timestamp and length from two Linux FIFOs,
and I use ByteStreamFileSource.cpp and ADTSAudioFileSource.cpp to read the frame data.

To synchronize H.264 and AAC, I follow testProgs/testOnDemandRTSPServer.cpp:

ServerMediaSession* sms
    = ServerMediaSession::createNew(*env, streamName, streamName,
                                    descriptionString);
sms->addSubsession(H264VideoFileServerMediaSubsession
                   ::createNew(*env, inputFileName, reuseFirstSource));
sms->addSubsession(ADTSAudioFileServerMediaSubsession
                   ::createNew(*env, inputFileName3, reuseFirstSource));

Everything works, but when I play the stream with VLC, it breaks after about 30 minutes.
The VLC debug messages are:
avcodec error: more than 5 seconds of late video -> dropping frame (computer too slow ?)
main warning: picture is too late to be displayed (missing 656606 ms)
main warning: picture is too late to be displayed (missing 656602 ms)
main warning: picture is too late to be displayed (missing 656598 ms)
main warning: picture is too late to be displayed (missing 656262 ms)
main warning: picture is too late to be displayed (missing 656298 ms)

I found that the timestamp code in ByteStreamFileSource.cpp is this:

// Set the 'presentation time':
if (fPlayTimePerFrame > 0 && fPreferredFrameSize > 0) {
  if (fPresentationTime.tv_sec == 0 && fPresentationTime.tv_usec == 0) {
    // This is the first frame, so use the current time:
    gettimeofday(&fPresentationTime, NULL);
  } else {
    // Increment by the play time of the previous data:
    unsigned uSeconds = fPresentationTime.tv_usec + fLastPlayTime;
    fPresentationTime.tv_sec += uSeconds/1000000;
    fPresentationTime.tv_usec = uSeconds%1000000;
  }

  // Remember the play time of this data:
  fLastPlayTime = (fPlayTimePerFrame*fFrameSize)/fPreferredFrameSize;
  fDurationInMicroseconds = fLastPlayTime;
} else {
  // We don't know a specific play time duration for this data,
  // so just record the current time as being the 'presentation time':
  gettimeofday(&fPresentationTime, NULL);
}

I also checked the timestamp handling in liveMedia/H264VideoStreamFramer.cpp; the presentation time is actually computed like this:

// Note that the presentation time for the next NAL unit will be different:
struct timeval& nextPT = usingSource()->fNextPresentationTime; // alias
nextPT = usingSource()->fPresentationTime;
double nextFraction = nextPT.tv_usec/1000000.0 + 1/usingSource()->fFrameRate;
unsigned nextSecsIncrement = (long)nextFraction;
nextPT.tv_sec += (long)nextSecsIncrement;
nextPT.tv_usec = (long)((nextFraction - nextSecsIncrement)*1000000);

It uses the frame rate to compute the timestamp.
So I tried setting the true video timestamps directly in "nextPT.tv_sec" and "nextPT.tv_usec".

Is there anything that I am missing? I found that even after this change, the same problem occurs.
Ross Finlayson
2014-09-16 08:46:29 UTC
Post by g***@vgaic.com
And I use ByteStreamFileSource.cpp and ADTSAudioFileSource.cpp to get the frame data.
ServerMediaSession* sms
= ServerMediaSession::createNew(*env, streamName, streamName,
descriptionString);
sms->addSubsession(H264VideoFileServerMediaSubsession
::createNew(*env, inputFileName, reuseFirstSource));
sms->addSubsession(ADTSAudioFileServerMediaSubsession
::createNew(*env, inputFileName3, reuseFirstSource));
Using a byte stream as input works well when you are streaming just a single medium (audio or video). However, if you are streaming both audio and video, and want them properly synchronized, then you *cannot* use byte streams as input (because, as you discovered, you don't get precise presentation times for each frame).

Instead - if you are streaming both audio and video - then each input source must deliver *discrete* frames (i.e., one frame at a time), with each frame being given a presentation time ("fPresentationTime") when it is encoded.
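
To make that concrete, here is a minimal sketch (not code from live555 itself) of a video source class that delivers one NAL unit at a time and copies the encoder's timestamp into "fPresentationTime". The class name "MyH264LiveSource" and the "readNalFromFifo()" helper are assumptions standing in for your own FIFO-reading code; only the "FramedSource" members used here come from the library:

#include "FramedSource.hh"
#include <string.h>
#include <stdint.h>

class MyH264LiveSource: public FramedSource {
public:
  static MyH264LiveSource* createNew(UsageEnvironment& env) {
    return new MyH264LiveSource(env);
  }

protected:
  MyH264LiveSource(UsageEnvironment& env): FramedSource(env) {}

private:
  virtual void doGetNextFrame() {
    // Hypothetical helper: returns one complete NAL unit (WITHOUT the
    // 0x00 0x00 0x00 0x01 start code), plus the encoder's timestamp in
    // microseconds, read from your FIFO:
    unsigned nalSize; uint64_t encodeTimeUs;
    uint8_t const* nal = readNalFromFifo(nalSize, encodeTimeUs);

    if (nalSize > fMaxSize) {
      fFrameSize = fMaxSize;
      fNumTruncatedBytes = nalSize - fMaxSize;
    } else {
      fFrameSize = nalSize;
      fNumTruncatedBytes = 0;
    }
    memcpy(fTo, nal, fFrameSize);

    // The key point: use the encoder's timestamp (the audio source must use
    // the same clock), not gettimeofday() and not a frame-rate increment:
    fPresentationTime.tv_sec = encodeTimeUs / 1000000;
    fPresentationTime.tv_usec = encodeTimeUs % 1000000;
    // For a live source, fDurationInMicroseconds can normally be left at 0.

    // Deliver the frame. (In real code you would usually schedule this call,
    // or use an event trigger, rather than calling it from a blocking read.)
    FramedSource::afterGetting(this);
  }

  uint8_t const* readNalFromFifo(unsigned& size, uint64_t& timeUs); // assumed
};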

Specifically: You will need to define new subclass(es) of "FramedSource" for your audio and video inputs. You will also need to define new subclasses of "OnDemandServerMediaSubsession" for your audio and video streams. In particular:
- For audio, your subclass will redefine the "createNewStreamSource()" virtual function to create an instance of your new audio source class (that delivers one AAC frame at a time).
- For video, your subclass will redefine the "createNewStreamSource()" virtual function to create an instance of your new video source class (that delivers one H.264 NAL unit at a time - with each H.264 NAL unit *not* having an initial 0x00 0x00 0x00 0x01 'start code'). It should then feed this into a "H264VideoStreamDiscreteFramer" (*not* a "H264VideoStreamFramer"). Your implementation of the "createNewRTPSink()" virtual function may be the same as in "H264VideoFileServerMediaSubsession", but you may prefer instead to use one of the alternative forms of "H264VideoRTPSink::createNew()" that takes SPS and PPS NAL units as parameters. (If you do that, then you won't need to insert SPS and PPS NAL units into your input stream.) A rough sketch of such a video subsession follows below.
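
For concreteness, here is a rough sketch of what the video subsession might look like, following the steps above. The class names ("MyH264LiveSubsession", "MyH264LiveSource") and the bitrate estimate are assumptions; the virtual function signatures and the "H264VideoStreamDiscreteFramer"/"H264VideoRTPSink" calls are the standard live555 ones:

#include "OnDemandServerMediaSubsession.hh"
#include "H264VideoStreamDiscreteFramer.hh"
#include "H264VideoRTPSink.hh"

class MyH264LiveSubsession: public OnDemandServerMediaSubsession {
public:
  static MyH264LiveSubsession* createNew(UsageEnvironment& env,
                                         Boolean reuseFirstSource) {
    return new MyH264LiveSubsession(env, reuseFirstSource);
  }

protected:
  MyH264LiveSubsession(UsageEnvironment& env, Boolean reuseFirstSource)
    : OnDemandServerMediaSubsession(env, reuseFirstSource) {}

  virtual FramedSource* createNewStreamSource(unsigned /*clientSessionId*/,
                                              unsigned& estBitrate) {
    estBitrate = 2000; // kbps; a placeholder estimate
    // Wrap the discrete NAL-unit source in a *discrete* framer:
    return H264VideoStreamDiscreteFramer::createNew(
        envir(), MyH264LiveSource::createNew(envir()));
  }

  virtual RTPSink* createNewRTPSink(Groupsock* rtpGroupsock,
                                    unsigned char rtpPayloadTypeIfDynamic,
                                    FramedSource* /*inputSource*/) {
    // Or use the form of "H264VideoRTPSink::createNew()" that takes SPS and
    // PPS NAL units, as noted above:
    return H264VideoRTPSink::createNew(envir(), rtpGroupsock,
                                       rtpPayloadTypeIfDynamic);
  }
};

You would then add this subsession (and the corresponding audio one) to your "ServerMediaSession", instead of the file-based subsessions used in "testOnDemandRTSPServer.cpp".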

Ross Finlayson
Live Networks, Inc.
http://www.live555.com/