Opened on 07/21/15 at 17:39:16
Last modified on 01/25/18 at 05:38:33
#2375 new defect
tftpDriver free's current_directory
Reported by: | Michael Davidsaver | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | 4.10.3 |
Component: | network/legacy | Version: | 4.10 |
Severity: | normal | Keywords: | |
Cc: | Angus Gratton | Blocked By: | |
Blocking: |
Description
It seems I've re-discovered this issue reported by Angus Gratton in 2010.
https://lists.rtems.org/pipermail/users/2010-July/022453.html
I can confirm Angus' diagnosis. In rtems_tftp_eval_path(),
'pathloc->node_access' defaults to
'rtems_current_user_env->current_directory'. In some cases
'->node_access' is replaced with a newly allocated string, in the others
it is not. Unfortunately rtems_tftp_free_node_info() happily free()s
unless cwd==ROOT_NODE_ACCESS(fs).
I chased this down with GDB and confirmed that, in my case, when
cwd=='/epics/BOOTP_HOST/epics/myhost' the current directory string is
free'd when open() fails to open a file, which results in:
Program heap: free of bad pointer 358CE4 -- range 2A1C10 - 7EE0000
when the following open() does the same.
Attachments (4)
Change History (28)
comment:1 Changed on 07/21/15 at 17:40:31 by Michael Davidsaver
comment:2 follow-up: 3 Changed on 07/26/15 at 16:49:36 by Michael Davidsaver
I've created a test program to demonstrate the problem. It can be run in QEMU with the pc386 BSP.
https://github.com/mdavidsaver/rtems-tftp-test
I've also made an attempt to backport the changes to tftpDriver.c from the master/4.11 branches. However, this was not enough to fix the issue. With an additional fix the test program passes.
https://github.com/mdavidsaver/rtems/commits/4.10-tftp-fix
The 4 revisions on this branch are:
apply changes through master
git diff 4.10.2..eb7753437ff858ebe34a08baef7dfdb45eb0f018 -- cpukit/libnetworking/lib/tftpDriver.c | git apply
The was later rebase'd onto the 4.10 branch head.
backport fixes
Undo some post 4.10 changes to make it build.
Use ->node_access_2 to hold the path string owned by location_info struct.
So node_access_2==NULL implies that node_access is a borrowed reference.
This is used in eval_path to ensure that all node_access for the FS root
share the same string (otherwise unmount() complains).
close() false error
It seems that close() returns -1 on success (errno==0). A bug, but unrelated to my original report.
don't free directory node's path string
Use of realloc() in eval_path still allows several location_info s to unwittingly share the same string. Since the length of the string isn't tracked, the parent node's path is silently changed. This patch fixes the reported issue.
comment:3 follow-up: 4 Changed on 07/27/15 at 14:08:02 by Gedare Bloom
Sebastian or Chris should review this fix. Pulling in all the changes since 4.11 includes many extraneous modifications. Is there an EPICS maintainer who might be able to comment?
Can you pull out the close() false error and put it into a separate ticket? It looks like this may affect 4.11 too.
It might be nice to put the rtems-tftp-test into the testsuite with some reformatting to fit the test patterns. (Might make a suitable set of Google Code-In tasks if we have that this year.)
comment:4 follow-up: 5 Changed on 07/27/15 at 19:23:09 by Michael Davidsaver
Replying to gedare:
... Pulling in all the changes since 4.11 includes many extraneous modifications.
Agreed, though I think most of the extraneous changes are reverted in the 2nd revision.
Is there an EPICS maintainer who might be able to comment?
On?
Can you pull out the close() false error and put it into a separate ticket?
It might be nice to put the rtems-tftp-test into the testsuite with some reformatting to fit the test patterns. (Might make a suitable set of Google Code-In tasks if we have that this year.)
Suggestions on how to structure this would be welcome. I wasn't sure how to do this since the test depends on files being present on the tftp server (and on a BSP with support for networking).
comment:5 follow-up: 8 Changed on 07/27/15 at 19:56:41 by Gedare Bloom
Replying to mdavidsaver:
Replying to gedare:
... Pulling in all the changes since 4.11 includes many extraneous modifications.
Agreed, though I think most of the extraneous changes are reverted in the 2nd revision.
Is there an EPICS maintainer who might be able to comment?
On?
The value of this fix.
Can you pull out the close() false error and put it into a separate ticket?
Thanks
It might be nice to put the rtems-tftp-test into the testsuite with some reformatting to fit the test patterns. (Might make a suitable set of Google Code-In tasks if we have that this year.)
Suggestions on how to structure this would be welcome. I wasn't sure how to do this since the test depends on files being present on the tftp server (and on a BSP with support for networking).
Ah, yes good point, actually running this test is problematic for most targets. The way we get around that in 4.11+ is to add the test to the list of exempted tests in a file in each BSP that gets read by the automated test runner. Thus, if a test only works on one BSP, it would be black-listed on all the other BSPs. This test would have additional constraints though (networking, connected server) that make it harder to fit in the current testing framework.
For now, it might be best to put the test into the network-demos repository. The existing Makefile should just about work there, and someone can do the 'waf' part easily enough.
comment:6 follow-up: 7 Changed on 07/27/15 at 19:57:17 by Gedare Bloom
P.S. perhaps you can use git-rebase
to squash the first two commits together? This would make reviewing the relevant changes much easier for us.
comment:7 Changed on 07/28/15 at 00:41:31 by Michael Davidsaver
Replying to gedare:
P.S. perhaps you can use
git-rebase
to squash the first two commits together? This would make reviewing the relevant changes much easier for us.
Sure could :) Easier to squash than to un-squash though. For review you might try:
https://github.com/RTEMS/rtems/compare/4.10...mdavidsaver:4.10-tftp-fix?w=1
or manually:
git diff --ignore-all-space 0f3388d1f3e98f70ebc2f15d015680747b795328..7f9003eff1c6b0df11c515fa2338124fbce845dd
Should I do a rebase?
comment:8 Changed on 07/28/15 at 01:07:43 by Michael Davidsaver
Replying to gedare:
Replying to mdavidsaver:
...
The appropriateness of this fix to their users. (Which is to say, maybe someone else who is impacted by the bug or the fix might be available to review the code?)
At present TFTP support is a compile time conditional choice between TFTP or NFS, with the default being NFS. IMO the age of this bug is an indication of how many sites change this as it breaks ~100% of the time. I'd like to make this a runtime choice.
My motivation for this actually relates to the second part of your reply. For the past year or so we've been working to increase the unit-test coverage in EPICS Base. Part of this was setting up continuous integration (Jenkins) to automatically build and run these tests.
This works great for Linux, Windows, and similar OSs, but not for RTEMS or vxWorks. We have tests for these targets as well, but it's a pain to run them, so it isn't often done. This led to some last minute pain in recent release cycles.
I've been working to automatically run these tests with RTEMS in QEMU, which could be done by a CI server. It's working great against RTEMS 4.9.6 and helped me find a bug in some other code I'm working on.
https://github.com/mdavidsaver/epics-base/tree/multiboot
(Don't get too excited about the branch name, it should have been something like "bspcmdline")
...
For now, it might be best to put the test into the network-demos repository. The existing Makefile should just about work there, and someone can do the 'waf' part easily enough.
Fine by me.
comment:9 Changed on 07/29/15 at 07:16:43 by Sebastian Huber
In case a certain feature/behaviour/API is important for you, then you should add test cases to the RTEMS test suite. For example the FTP client and server is tested like this:
https://git.rtems.org/rtems/tree/testsuites/libtests/ftp01/init.c
For TFTP you just have to add a simple server to the test or cpukit in order to get a self contained test.
comment:10 Changed on 07/30/15 at 22:47:09 by Joel Sherrill
Can the proper patches to RTEMS and network-demos be attached to this ticket?
I would like to get this resolved for 4.11 and master.
Changed on 08/01/15 at 13:23:57 by Michael Davidsaver
Attachment: | 0001-tftpDriver-apply-changes-through-master.patch added |
---|
apply changes through master
Changed on 08/01/15 at 13:24:15 by Michael Davidsaver
Attachment: | 0002-tftpDriver-backport-fixes.patch added |
---|
backport fixes
Changed on 08/01/15 at 13:24:30 by Michael Davidsaver
Attachment: | 0003-tftpDriver-don-t-free-directory-node-s-path-string.patch added |
---|
don't free directory node's path string
comment:11 Changed on 11/20/16 at 13:49:11 by Michael Davidsaver
I've seen mention of 4.10.3-rc*. It would be quite nice to get this fix included in 4.10.3.
comment:12 Changed on 01/25/17 at 13:41:58 by Sebastian Huber
The patches are quite confusing. Would you mind sending patches via "git format-patch" and "git send-email" to the devel@… mailing list for RTEMS 4.10 and 4.11?
comment:13 Changed on 01/25/17 at 14:53:32 by Michael Davidsaver
umm... these patch files were produced with git-format-patch. I will post them to the devel list, however I doubt that this will make them less "confusing".
comment:14 Changed on 01/25/17 at 14:57:00 by Sebastian Huber
Is 0001-tftpDriver-apply-changes-through-master.patch relevant for 4.10? The file system layer was completely reworked during 4.10 and 4.11.
comment:15 Changed on 01/25/17 at 15:25:25 by Michael Davidsaver
These patches apply against the 4.10 branch as of 23 Feb 2015 (0f3388d1f3e98f70ebc2f15d015680747b795328). So yes it is relevant.
comment:16 Changed on 01/26/17 at 06:42:34 by Michael Davidsaver <mdavidsaver@…>
comment:17 Changed on 01/26/17 at 06:44:16 by Sebastian Huber
Ok, I checked in these three patches on the 4.10 branch.
[aecb508673df0d047e055f452e3f2d9db17328c0/rtems]
Please check that this is what you want.
Do we need changes on the 4.11 and master branches?
comment:18 Changed on 01/26/17 at 15:00:20 by Michael Davidsaver
Do we need changes on the 4.11 and master branches?
Unknown. I only tested 4.9 and 4.10 branches. Running my test case should answer this question, as well as confirming the fix on 4.10. I'm also curious if someone other than myself can run this.
comment:19 Changed on 01/27/17 at 06:27:39 by Sebastian Huber
I don't have time to test this. It would be good to fix this for 4.11 and master as well so that we can close this ticket.
Changed on 01/30/17 at 01:37:09 by Michael Davidsaver
Ran rtems-tftp-test against aecb508673df0d047e055f452e3f2d9db17328c0
comment:20 Changed on 01/30/17 at 01:41:17 by Michael Davidsaver
atm I only have the 4.9 and 4.10 toolchains built. Running my rtems-tftp-test against aecb508673df0d047e055f452e3f2d9db17328c0 appears to succeed. So this issue is fixed at least on the 4.10 branch.
comment:21 Changed on 01/30/17 at 06:35:21 by Sebastian Huber
Ok, thanks for testing. In order to properly fix this issue we should:
- Add a TFTP sever to RTEMS
- Write a self-contained TFTP client/server tests similar to https://git.rtems.org/rtems/tree/testsuites/libtests/ftp01/init.c
- Fix this bug in 4.11 and master if necessary.
comment:22 Changed on 02/11/17 at 11:40:51 by Tanu Hari Dixit
I was trying to test this in master and faced a problem. In https://github.com/mdavidsaver/rtems-tftp-test/blob/master/init.c why isn't libio.h included when macros RTEMS_FILESYSTEM_TYPE_TFTPFS, RTEMS_FILESYSTEM_READ_WRITE are used here? Also when I run make, the compiler shows error:
o-optimize/netconfig.o:(.data.loopback_config+0x4): undefined reference to `rtems_bsdnet_loopattach' collect2: error: ld returned 1 exit status make: *** [o-optimize/tftp-test.exe] Error 1
I couldn't find rtems_bsdnet_loopattach been defined in any header file. If this a typical driver attach function, why isn't rtems_bsdnet_ifconfig passed in the function signature? What am I missing?
comment:23 Changed on 02/13/17 at 06:51:52 by Sebastian Huber
This rtems_bsdnet_loopattach() was a hack and is no longer needed in RTEMS 4.11.
comment:24 Changed on 01/25/18 at 05:38:33 by Chris Johns
I believe this ticket has been fixed and can be closed?
Looking through the VCS history I think this issue was fixed in 2012 as
a consequence of 3b7c123c8d910eb60ab3b38dec6224e2de9847c9. I haven't confirmed this since EPICS doesn't build
against the VCS master branch.
I have confirmed that it isn't fixed on the 4.10 branch.