
MBNet issues (custom build)


Sauraen

I'm trying to get basic MBNet communication set up between two LPC17 cores. I initialize MBNet with MBNet_Init(0) and then MBNet_NodeIDSet(), and then I have an RTOS task set up to call MBNet_Handler() every millisecond. On initialization I get the message "[MBNET] ERRORs detected - permanent off state reached!", which I assume means something in init failed, but I can call MBNet_Init() again and get a whole lot of "[MBNET] sent REQ: ID=[etc.]" messages, which all look correct and suggest that it's actually sending data. According to my cheap pocket oscilloscope, however, there is no data on the CAN pin at all; it just sits at +5V (whether or not it's connected to the other core, and even while it's sending lots of REQs). Of course, the other core, running identical code but with a different master node ID, does not react in any way to the messages I'm trying to send (or I wouldn't be asking the question!).

 

Is there some sort of software switch I have to turn on to use MBNet, like MIOS32_MBNET_ENABLE or something like that? Otherwise, what could be going on?


Permanent off state will be reached if one core tries to send messages to another core and doesn't get acknowledges. Depending on the number of missed acknowledges, it will first go into error passive state, and later into bus off state, where no message can be sent anymore until the CAN peripheral is re-initialized.

You won't find this handling in the mbnet driver, because it's implemented in the CAN peripheral (based on the CAN spec, see: http://esd.cs.ucr.edu/webres/can20.pdf chapter 7 "Fault Confinement")
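
To make the state progression above more concrete, here is a minimal sketch of the transmit-side counter rules from the "Fault Confinement" chapter of the CAN 2.0 spec: a failed transmission adds 8 to the transmit error counter, a successful one subtracts 1, the node is error passive from 128 and bus off from 256. All type and function names are made up for illustration; none of them exist in the mbnet driver or the LPC17 CAN peripheral, and the model leaves out the receive counter and the exceptions the spec lists.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of CAN 2.0 transmit-side fault confinement.
 * Hypothetical names; not part of MIOS32 or the mbnet driver. */
typedef enum {
    CAN_ERROR_ACTIVE,   /* normal operation */
    CAN_ERROR_PASSIVE,  /* transmit error counter >= 128 */
    CAN_BUS_OFF         /* transmit error counter >= 256 */
} can_state_t;

typedef struct {
    uint16_t tec;       /* transmit error counter */
} can_node_t;

/* Derive the fault confinement state from the counter value. */
can_state_t can_state(const can_node_t *n)
{
    if (n->tec >= 256) return CAN_BUS_OFF;
    if (n->tec >= 128) return CAN_ERROR_PASSIVE;
    return CAN_ERROR_ACTIVE;
}

/* Account for one transmission attempt: +8 on failure, -1 on success.
 * A node in bus off state no longer takes part in bus traffic. */
void can_tx_result(can_node_t *n, int ok)
{
    if (can_state(n) == CAN_BUS_OFF) return;
    if (ok) {
        if (n->tec > 0) n->tec--;
    } else {
        n->tec += 8;
    }
}

/* Count how many consecutive failed transmissions a fresh node
 * needs before it reaches the given state. */
int failures_until(can_state_t target)
{
    can_node_t n = {0};
    int count = 0;
    while (can_state(&n) < target) {
        can_tx_result(&n, 0);
        count++;
    }
    return count;
}
```

In this simplified model, failures_until(CAN_ERROR_PASSIVE) gives 16 and failures_until(CAN_BUS_OFF) gives 32, which fits the log further down where the error message appears within about a millisecond of the first REQ, since the peripheral retries automatically.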

 

What this means: the execution of MBNET_Handler() should be delayed by 2..3 seconds after power-on to ensure that all CAN nodes are ready and can acknowledge incoming messages.

 

In addition, it makes sense to add a way to re-initialize the CAN if it goes into bus-off state (you already noticed that this helps).

 

For example, in the MBSID V2 firmware the CAN is re-initialized if the user pushes one of the SID buttons while CAN is in bus-off state. This also restarts the scan for MBNET nodes -> hot plug & play! :)
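
The two suggestions above (hold off the handler for a few seconds, and allow a re-init from bus-off) could be combined in the 1 ms task roughly as follows. This is only a sketch: net_task_t, net_task_tick() and the flags are hypothetical, the 3000 ms delay is an assumption taken from the "2..3 seconds" above, and the places where real code would call the MBNET handler or re-initialize the CAN peripheral are only marked with comments.

```c
#include <assert.h>
#include <stdint.h>

#define STARTUP_DELAY_MS 3000u  /* assumed value; the post suggests 2..3 s */

/* Hypothetical bookkeeping for a task that is called once per millisecond. */
typedef struct {
    uint32_t uptime_ms;        /* milliseconds since power-on */
    int      bus_off;          /* set when the CAN peripheral reports bus off */
    int      reinit_requested; /* e.g. set when the user pushes a button */
    unsigned handler_calls;    /* how often the handler would have run */
    unsigned reinit_calls;     /* how often the CAN would have been re-initialized */
} net_task_t;

/* One 1 ms tick of the task. */
void net_task_tick(net_task_t *t)
{
    t->uptime_ms++;

    /* Give all other nodes time to come up before sending anything. */
    if (t->uptime_ms < STARTUP_DELAY_MS) return;

    if (t->bus_off) {
        /* Stay quiet until somebody asks for a re-init. */
        if (t->reinit_requested) {
            /* Real code would re-initialize the CAN peripheral here. */
            t->bus_off = 0;
            t->reinit_requested = 0;
            t->reinit_calls++;
        }
        return;
    }

    /* Real code would call the MBNET handler here. */
    t->handler_calls++;
}
```

The counters exist only so the behaviour can be checked off-target; in firmware they would be replaced by the actual calls.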

 

Best Regards, Thorsten.


All right, so I should still have both cores call MBNet_Init(0) and MBNet_NodeIDSet() immediately upon power-up, but then wait a few seconds before calling MBNET_Handler()? If it does go into bus off state, will calling MBNet_Init(0) again turn it back on?

 

The core only responds to incoming messages in MBNET_Handler(), so it won't respond to the other core if it isn't yet calling MBNET_Handler() because it's still waiting the three seconds. But if it has called MBNET_Handler() first, the other core won't respond and it'll turn off. I'm having a little trouble understanding how a core can only turn MBNet fully "on" after receiving acknowledges from another core, but it can only send acknowledges once it's fully "on"!


"Acknowledge" could be confusing here, because we actually have two different types: the ACK of the CAN hardware protocol (this is the one I was writing about), and the ACK of the MBNET software layer.
 
However, I wrote a small demo application to test the communication between two cores myself.
I noticed that the wrong pinning was selected in modules/mbnet/LPC17xx/mbnet_hal.c. Did you notice this as well? If not, then we know why it didn't work.
The default selection is for the MBHP_CORE_LPC17 board now.
I also added some support functions for the MIOS Terminal to simplify debugging.
Please update your repository.
 
The new demo application can be found under apps/examples/mbnet
 
Debug messages are enabled by default (verbose level 3).
 
After startup the application will scan for available nodes between ID 0x00..0x07 (configured in mios32_config.h)
 
Enter "help" for available commands:

[16602.262] Welcome to MBNET Example!
[16602.262] Following commands are available:
[16602.262]   mbnet:                            prints status informations
[16602.262]   mbnet_reconnect:                  (re-)scans for MBNET nodes on the bus
[16602.262]   set mbnet_id <0x00..0x7f>:        changes my MBNET ID (current ID: 0x10)
[16602.262]   set mbnet_verbose <0..4>:         enables MBNET debug messages (verbose level: 3)
[16602.263]   reset:                            resets the MIDIbox (!)
[16602.263]   help:                             this page

 
E.g. with "mbnet" I'm getting:

[16643.012] mbnet
[16643.014] My MBNET ID: 0x10
[16643.014] MBNET State: running
[16643.014] MBNET Scan State: finished
[16643.014] Slave # 1: not found
[16643.014] Slave # 2: not found
[16643.014] Slave # 3: not found
[16643.014] Slave # 4: not found
[16643.014] Slave # 5: ID 0x04  P:1 T:ABCD V:1.0
[16643.014] Slave # 6: not found
[16643.014] Slave # 7: not found
[16643.014] Slave # 8: not found
[16643.014] MBNET Verbose Level: 3

 
 
I'm using two instances of MIOS Studio, connected to the two cores.
The MBNET ID of the second core has been changed via "set mbnet_id 4"
Then I entered "mbnet_reconnect" on the first core:
 
mbnet_debug.png
 
I also tested the communication with a MBSID V2 (PIC based), and it works:

[16937.869] mbnet
[16937.872] My MBNET ID: 0x10
[16937.872] MBNET State: running
[16937.872] MBNET Scan State: finished
[16937.872] Slave # 1: ID 0x00  P:1 T:SID  V:2.0
[16937.872] Slave # 2: ID 0x01  P:1 T:SID  V:2.0
[16937.872] Slave # 3: ID 0x02  P:1 T:SID  V:2.0
[16937.872] Slave # 4: ID 0x03  P:1 T:SID  V:2.0
[16937.872] Slave # 5: not found
[16937.872] Slave # 6: not found
[16937.872] Slave # 7: not found
[16937.872] Slave # 8: not found
[16937.872] MBNET Verbose Level: 3

 
So, I guess that it should work properly at your side as well...
 
Best Regards, Thorsten.


It being the wrong pin would explain me not seeing data on that pin! Now my cheap scope shows a peak-to-peak voltage of 2.0 V when set to 50 ns, even though I still can't see the data.

 

But it's still not working. The app compiles fine and seems to work fine, except neither core sees the other. The only things I changed are a) the MBNet ID of each core, making one 0x00 and one 0x20 and compiling a version for each, and b) this in mios32_config.h:

 

#define MBNET_SLAVE_NODES_MAX 33
#define MBNET_SLAVE_NODES_BEGIN 0x00
#define MBNET_SLAVE_NODES_END   0x20
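
As a side check on these three values: the sketch below assumes that MBNET_SLAVE_NODES_MAX is meant to be the number of IDs in the scanned range, i.e. END - BEGIN + 1, which is what 33 for 0x00..0x20 suggests. That relationship is an assumption, not something confirmed by the driver source, and slave_slot_for_id() is a hypothetical helper that does not exist in MIOS32.

```c
#include <assert.h>

#define MBNET_SLAVE_NODES_MAX   33
#define MBNET_SLAVE_NODES_BEGIN 0x00
#define MBNET_SLAVE_NODES_END   0x20

/* Assumption: the table size must cover every ID in the scanned range. */
_Static_assert(MBNET_SLAVE_NODES_MAX ==
               MBNET_SLAVE_NODES_END - MBNET_SLAVE_NODES_BEGIN + 1,
               "slave table size does not match the scanned ID range");

/* Hypothetical helper: zero-based table slot for a node ID,
 * or -1 if the ID lies outside the scanned range. */
int slave_slot_for_id(int node_id)
{
    if (node_id < MBNET_SLAVE_NODES_BEGIN || node_id > MBNET_SLAVE_NODES_END) {
        return -1;
    }
    return node_id - MBNET_SLAVE_NODES_BEGIN;
}
```

With this numbering, ID 0x04 lands in zero-based slot 4, which matches the "Slave # 5: ID 0x04" line in the status output of the demo application.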

 

Once both were running, I told each to refresh, even changing the MBNet ID of each at runtime. Both continued to refuse to see each other. I managed to reboot one core and refresh MIOS32 in time to catch this on the terminal output--this was while the other core's ID was set to 0x01:

 

[6071.315] reset
[6075.392] [MBNET] ----- initialized -----------------------------------------------------
[6075.393] [MBNET] sent REQ: ID:0x01 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00
[6075.393] [MBNET] ERRORs detected - permanent off state reached!
[6075.394] [MBNET] sent REQ: ID:0x02 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00
[6075.395] [MBNET] sent REQ: ID:0x03 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00

 

So it looks like it's seeing something from the other core, but immediately turning off. [see edit]

 

Edit: Also, thanks for your quick work on this!

 

Edit: I managed to get two instances of MIOS32 open without breaking the kernel MIDI pipeline :smile:, and can see the output from both cores. For both, as soon as they send their first REQ, the error message comes back, even though one core is 0x10 and one is 0x00. So it's not that it's seeing the other core but getting a badly formatted response or something--it's failing when it's trying to send the message.

 

Edit: Sorry, last edit! Same problem happens whether or not the actual CAN cable is connected between the cores. :(


I tried the same slave node range, and it still works at my side.

 

Regarding the debug messages: what kind of computer are you using? On a Windows PC, MIDI can get unstable if only the M$ device driver is used. With MacOS the communication is very stable.

 

Besides me, you are the only one who has tried MBNET so far, so I don't have enough experience with what could go wrong at the user side.

As far as I can tell, the described behaviour indicates a hardware connection issue.

Check that D1 is soldered in the right direction, and that R11 is mounted.

 

It could make sense to remove R11 on one board, because the so-called "Wired AND" should actually have only a single pull-up resistor. See also this schematic, which shows how the CAN interfaces have to be connected for multiple PICs: http://www.ucapps.de/midibox_sid/mbsid_v2_communication.pdf

 

So, at my side it works with R11 connected on both boards (so that the resulting resistance is 500 Ohm instead of 1k), but it could be that the lower resistance reduces the signal level so much that the communication fails at your side.
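
The 500 Ohm figure follows from the pull-ups of all connected boards sitting in parallel on the same bus line. A small sketch of that arithmetic, with hypothetical function names, assuming identical resistors on every board and ignoring the diode drop and the output resistance of the driving pin:

```c
#include <assert.h>

/* Effective pull-up when each of `boards` boards contributes one
 * resistor of r_each ohms: n equal resistors in parallel give R/n. */
double parallel_pullup_ohms(double r_each, int boards)
{
    assert(r_each > 0.0 && boards > 0);
    return r_each / boards;
}

/* Current in mA that a node has to sink to pull the line low,
 * assuming the pull-ups go to v_pullup volts. */
double sink_current_ma(double v_pullup, double r_ohms)
{
    assert(r_ohms > 0.0);
    return v_pullup * 1000.0 / r_ohms;
}
```

Two 1k resistors give 500 Ohm and three give about 333 Ohm, so with a +5V pull-up the node pulling the line low has to sink roughly 10 mA and 15 mA respectively, instead of 5 mA with a single resistor.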

 

Best Regards, Thorsten.


I'm using Ubuntu Linux AMD64, but I also have Windows 7 and access to a laptop with Mac OS X (Leopard?) if that is necessary. I only occasionally get the "(1 ignorable errors during upload)" message, so I think it's pretty stable; though I recently have been getting "Error 14: Bad SysEx message" or something when uploading to one of my cores. Putting it into bootloader mode fixes that.

 

Anyway, to get back to the MBNET: I have three cores with different sections of the board stuffed for each (as per the requirements of my build), and I checked the CAN hardware on all three. It seems to be fine (continuity from LPCXPRESSO module, correct 1k on each board, correct diode voltage drop in correct direction, connected to +5--though should it be connected to +3.3 instead, since that's the core voltage?) I have been trying the MBNET application with different pairs of the three cores.

 

The following all happens whether or not the CAN wire is connected between the cores.

 

If I compile the application with the node ID being initialized to a slave node, in this case 0x05, the core starts up with "[MBNET] initialized------" and when I type "mbnet", says that MBNET is running (and that it can't find any slaves). If there are two cores like this connected with the wire, telling them both to reconnect does not make them see each other.

 

However, if I change the node ID to a master node, in this case 0x00 or 0x10, or compile the application with this as its node ID, I get:

 

[1992.086] set mbnet_id 0x10
[1992.089] MBNET ID changed to 0x10.
[1992.089] [MBNET] sent REQ: ID:0x00 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00
[1992.089] [MBNET] ERRORs detected - permanent off state reached!
[1992.090] [MBNET] sent REQ: ID:0x01 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00
[1992.091] [MBNET] sent REQ: ID:0x02 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00
[1992.092] [MBNET] sent REQ: ID:0x03 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00
[1992.093] [MBNET] sent REQ: ID:0x04 TOS=3 DLC=0 MSG=00 00 00 00 00 00 00 00

[etc.]

 

Once the core has been put in permanent off, nothing but resetting it turns on the CAN again. Even after changing the node ID back to a slave node and telling it to reconnect, it still says that the bus is permanent off.

 

This is complicated by the fact that once I get it working, I need all three cores to be masters!


Any MacOS should be the preferred choice for debugging (if you are able to run this OS); for MIDI it's the most robust solution.

Anyhow, I still believe that this issue is related to a connection problem. Unfortunately I'm currently not at home, otherwise I could take some snapshots from my scope to give you some reference waveforms...

Let me try to describe the expected waveforms verbally: CAN communication only works if at least two nodes are connected to the bus. The node which sends a CAN frame expects all other nodes to send an acknowledge pulse (I'm speaking about the physical layer, not the MBNET layer, where ACK has a different meaning...)

So, if the sender doesn't get the acknowledge pulse at the end of the frame, it will retry automatically... and give up after a certain number of retries and go into Bus Off state, where it doesn't send anymore until re-initialization. It could be that the re-initialization currently only takes place after an application reset; I haven't checked this at my side yet (and won't be able to check it in the next week...)

What is expected on the scope:

All Rx pins should always show the complete CAN frame (let's say: a lot of pulses...)
The Tx pin of the sending node should show almost the complete CAN frame as well... Just the last pulse - the ACK - should be missing.
The Tx pin of the receiving node should be idle (logic-1) while the other node sends a frame. But at the end of the frame it should send the ACK pulse.

It's very easy to debug this with a 2 channel scope: just probe both Tx pins, and alternatively one Tx and one Rx pin (do you need further descriptions, or is it already clear enough?)

Best Regards, Thorsten.
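
The ACK behaviour described in this post can be modelled with a few lines of C. This is a toy model of the wired-AND bus only, with made-up function names: logic-1 is the idle (recessive) level, logic-0 is dominant, and the bus is low as soon as any node pulls it low.

```c
#include <assert.h>
#include <stddef.h>

/* Wired-AND bus: the line reads 1 only if every Tx output is 1. */
int wired_and_bus(const int *tx_levels, size_t n)
{
    int bus = 1;
    for (size_t i = 0; i < n; i++) {
        bus &= tx_levels[i] ? 1 : 0;
    }
    return bus;
}

/* Bus level as seen on the Rx pins during the ACK slot of a frame.
 * The sender leaves its Tx at 1; a receiver that is present and got
 * the frame correctly pulls its Tx to 0 for this one bit. */
int ack_slot_level(int receiver_present)
{
    int tx[2];
    tx[0] = 1;                         /* sender: ACK pulse "missing" on Tx */
    tx[1] = receiver_present ? 0 : 1;  /* receiver: sends the ACK pulse */
    return wired_and_bus(tx, 2);
}

/* The sender considers the frame acknowledged if it reads back 0. */
int sender_sees_ack(int receiver_present)
{
    return ack_slot_level(receiver_present) == 0;
}
```

With no second node on the bus (or a broken connection) the ACK slot stays at 1, the sender counts a transmit error and retries, which is the path into Bus Off state described above.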


Thanks, that makes more sense, I didn't realize there was a hardware-level acknowledgement signal. Your explanation is nice and clear--I just need a better scope! I'll see what I can do with my cheap one though. :(

 

Edit: Well, that didn't take too long!

 

gallery_10357_139_245324.jpg

 

With three cores connected, the current draw was such that the resistance of the fuse between the 3.3V output of my PSU and the 3.3V rail of the synth made the core voltage drop to 2.8V. Interestingly, it was like this while I was recording the last video last weekend, and the SID core controlled the OPL3 fine at 2.8V, but evidently that wasn't enough for MBNET to work. I upped the fuse size and the cores started getting 3.25V again, and after a little resetting and fiddling (it works with two cores' pull-up resistors but not three, so I cut one off), they started seeing each other! Now for some control of some OPL3 parameters, and finishing the OPL3 driver (so you can start on the application layer for MIDIbox FM 2.0! :P )

 

Thanks for the help! Bought you a beer. :)

