T42 | <b100dian> mal: updates (not much different): the else branch is used on my 18.1 port (especially at end of playback). This is browser playback | 19:32 |
---|---|---|
T42 | <b100dian> for reordering the `m_cb.buffers_released(m_data);` call to be last even with all the method inside the locker, I still see random crashes in browser playback, especially when seeking/changing resolution on youtube | 19:34 |
T42 | <b100dian> The crashes are with GECKO_CAMERA_DROID_FORCE_MEDIA_BUFFER=1 which is the workflow I made work | 19:34 |
T42 | <b100dian> Today I didn't see a crash that points to the cause (other thank corrupted stack like https://pastebin.ubuntu.com/p/qccXMGvPdF/) but last month when I was developing this I got lucky and I remember a specific crash leading me to think that the order is the key. I don't have it as we speak.. | 19:36 |
mal | @b100dian very odd how moving that line could cause crashes | 19:47 |
T42 | <b100dian> My hypthesis is that the call to release buffers may touch the memory of the buffers that was just freed above. Usually a use after free gets unnoticed, that's why it gets by silently. But I am open to other explanations | 19:48 |
mal | @b100dian that call goes to gst-droid and gecko-camera also | 19:59 |
mal | https://github.com/sailfishos/gecko-camera/blob/master/plugins/droid/droid-codec.cpp#L671 and https://github.com/sailfishos/gst-droid/blob/master/gst/droidcodec/gstdroidvdec.c#L375 | 19:59 |
T42 | <b100dian> Agreed, that's what I refer to as "clients". I can only test with gecko-camera on my lineage-18.1 port, I still have to figure out what is wrong with gst-droid. About the buffers_released in gecko-camera, hang on | 20:02 |
T42 | <b100dian> it calls clear on m_bufferPool which is calling the destructors on DroidGraphicBufferPool::Item which call droid_media_buffer_destroy | 20:03 |
mal | @b100dian about the crash in media_buffer_destroy you mention in the comment, is that from decStrong? | 20:18 |
T42 | <b100dian> The crash I mention in the comment is the same use after free as above. decStrong is just one method you cannot call on a freed pointer. | 20:20 |
mal | @b100dian just wondering what the ref count of the slot.droidBuffer is in your case, and what would doing decStrong for the slot.droidBuffer in frameAvailable do in your case | 20:24 |
T42 | <b100dian> Oh, that I think what I was tryig to avoid by tracking out-of-slot buffers. I cannot decStrong them again. And if I don't decStrong them when they are replaced in their slots, it's out of buffers for playback. Do you mean I should try to unconditionally decStrong them? Maybe I should look up a refCount getter for the buffers if so | 20:26 |
mal | maybe you could print the slot.droidBuffer.m_refCount in that if you have (m_refCount is public member of the buffer) | 20:27 |
T42 | <b100dian> They're of MediaBufferBase type I think.. | 20:28 |
mal | DroidMediaBufferSlot has android::sp<DroidMediaBuffer> droidBuffer; and DroidMediaBuffer has public member m_refCount so slot.droidBuffer->m_refCount should work? | 20:29 |
mal | or is there another ref count for those? | 20:31 |
T42 | <b100dian> Oh, I think I got lost into multiple DroidMediaBuffer defines, there's one in AsyncCodecSource which is probably camera, and _DroidMediaBuffer indeed has a m_refCount I guess, let me try | 20:32 |
mal | but that incStrong/decStrong is doing different thing | 20:32 |
T42 | <b100dian> It's an ANativeWindowBuffer thing I guess? | 20:33 |
mal | a bit confusing why there are those incRef and decRef is that DroidMediaBuffer | 20:36 |
mal | @b100dian maybe add debug prints to the incRef and decRef to see if those are called | 20:38 |
mal | those are assigned in the DroidMediaBuffer constructor | 20:40 |
T42 | <b100dian> Yeah, I just saw they're static and they get assigned to some "common" which is probably a data member of the ANativeWindowBuffer, looking for its header | 20:43 |
mal | yes | 20:43 |
T42 | <b100dian> Are you trying to.. well, get rid of the unslotted memory stuff so we can just decRef only if needed? | 20:43 |
mal | yes, that would be clear if possible | 20:44 |
T42 | <b100dian> I see. Because that has the same problem I tried to solve with what I deemed "pointer tracking" - you can't get the info if it is there or not if you don't have it in memory to inspect.. So you still need something to dereference like m_buffer->decRefIfNeeded() but you can't do that _and_ free the memory too | 20:45 |
T42 | <b100dian> The best way would be to have a cb thatvsays releaseBuffer, singular, implemented in clients | 20:48 |
mal | if you have time please check if those incRef and decRef are used | 20:50 |
mal | maybe I could also test on my device | 20:50 |
T42 | <b100dian> Yes, I will do that too. Unfortunately I cannot provide a solution like "releaseBuffer", singular until I figure out what's wrong with gst-droid on this port.. | 20:53 |
mal | wht | 20:55 |
mal | what was the issue with that? | 20:55 |
T42 | <b100dian> Long story. So video playback was not working here, browser was many frames then halt, gallery little frames then halt. I have figured out this issue in usage of droidmedia by gecko-camera, that buffers do not get recycled. I have not yet figured out what is the issue with gst-droid, it is an issue as we speak. | 20:57 |
T42 | <b100dian> This was the debugging of gst-droid I did last time https://piggz.co.uk/sailfishos-porters-archive/index.php?log=2023-07-30.txt#line3 | 21:00 |
mal | what kind of hardware is the device? qcom? mediatek? | 21:02 |
mal | you have to force media buffers with that GECKO_CAMERA_DROID_FORCE_MEDIA_BUFFER=1 ? | 21:04 |
T42 | <b100dian> Asus Zenfone 8 - QCom. But /proc/cpuinfo does not reaveal it :) | 21:04 |
T42 | <b100dian> So I dont' have to force it. But I did try without, and there was a crash converting QOMX_COLOR_FormatYUV420PackedSemiPlanar32m (logs from 2023-07-16 I think) | 21:05 |
T42 | <b100dian> (which was probably because the 32m layout was not contiguous, but I didn't dig that fully since I made this media buffers discovery) | 21:11 |
T42 | <b100dian> mal: incRef and decRef are used | 21:12 |
T42 | <b100dian> So I could probably use m_refCount but I would still need them to be.. allocated by that time. Let's continue tomorrow, I'm appreciating your input on this | 21:34 |
Generated by irclog2html.py 2.17.1 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!