
riscv-assembler's Introduction

Hi, I'm Kaya! 👋

I'm a Wealth Management Technology Analyst at Morgan Stanley. I recently graduated from Duke University with a B.S. in Computer Science and Statistical Science. I was an REU research student in the WiMNet Lab of Columbia University's EE department, as well as a research intern in Duke ECE's lab.

I love tinkering with ideas and concepts related to my professional field (machine learning, finance, time-series analysis), as well as others that are completely unrelated (RISC-V, airline pricing, computational linguistics, cartography, etc.). Feel free to explore my past and ongoing projects!

Ongoing Projects - celebi-pkg 🗺

Here is a list of ongoing projects that will be updated frequently on GitHub.

riscv-assembler Python Package (Documentation) ⚙️

This is a Python package that is currently available to use. The package provides tools for converting RISC-V assembly to machine code. It can convert whole files into machine code and analyze individual instructions. While a version is currently up and running on PyPI, I continue to update it and add more features. Check it out and let me know how it works and if it needs something extra!

DJI-VXX 📈

A regression model that can predict the Dow Jones Industrial Average and the Volatility Index based on Wall Street Journal news headlines. Check out the updates here

Google Flight Analysis (Documentation) 🛩

This project provides tools and models for users to analyze, forecast, and collect data regarding flights and prices. There are currently many features in initial stages and in development. The current features (as of 8/29/22) are:

  • Scraping tools for Google Flights
  • Base analytical tools/methods for price forecasting/summary
  • Models to demonstrate ML techniques on forecasting
  • API for access to previously collected data

Check out the updates here

Other Projects

riscv-assembler's People

Contributors

annestrand, kcelebi, penguinliong


riscv-assembler's Issues

Incorrect translation of `BEQ`?

Assembly code

addi s0 x0 10
addi s1 x0 10

loop:
	addi s1 x0 -1 ; I know it's the same as s1 = -1
	beq s1 x0 out
	beq x0 x0 loop

out:
	addi s1 s0 -32

Assembling it

from pathlib import Path
from riscv_assembler.convert import AssemblyConverter

BASE_PATH = Path("code")
conv = AssemblyConverter(output_type="bt")
conv.convert(str(BASE_PATH / "loop.s"))

Instructions in binary

This is the .txt file produced by convert:

00000000101000000000010000010011
00000000101000000000010010010011
11111111111100000000010010010011
00000000000001001000010001100011
01111100000000000000101011100011
11111110000001000000010010010011

Issue

The beq x0 x0 loop instruction seems to be encoded incorrectly. According to chapter 19 "RV32/64G Instruction Set Listings" of the spec, this is a B-type instruction, so the assembler's output splits into fields as follows:

0 111110 00000 00000 000 1010 1 1100011
^ ^^^^^^ ^^^^^ ^^^^^     ^^^^ ^
| |      |     rs1       |    |
| |      rs2             |    imm[11]
| imm[10:5]              imm[4:1]
imm[12]

Thus, the immediate bits are:

  • imm[4:1] = 1010
  • imm[10:5] = 111110
  • imm[11] = 1
  • imm[12] = 0

According to section 2.3 "Immediate Encoding Variants", imm[12] is the sign bit of the immediate. Since imm[12] = 0, the offset is positive, so beq x0 x0 loop will jump forward even though it is supposed to jump backward, to the loop label.

The immediate is:

0000 1 111110 1010 0 = 4052

So we'll jump 4052 bytes forward???

Furthermore, RARS provides a different encoding for this instruction:

v--- different leading bit
1 111111 00000 00000 000 1100 1 1100011 <- RARS
0 111110 00000 00000 000 1010 1 1100011 <- this assembler

RARS's immediate is 1111 1 111111 1100 0 = -8. B-type offsets are measured in bytes, so that's a jump 8 bytes back, i.e. 8 / 4 = 2 instructions back, which leads to the loop label, which makes sense.
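
The two immediates above can be re-derived mechanically. The following is a minimal sketch, not code from riscv-assembler; the helper name `b_type_offset` is made up for illustration. It reassembles the 13-bit B-type immediate from a 32-bit instruction word and sign-extends it.

```python
def b_type_offset(word: int) -> int:
    """Return the signed byte offset encoded in a 32-bit B-type instruction.

    Hypothetical helper for illustration; not part of riscv-assembler.
    """
    imm = (
        ((word >> 31) & 0x1) << 12     # imm[12], the sign bit
        | ((word >> 7) & 0x1) << 11    # imm[11]
        | ((word >> 25) & 0x3F) << 5   # imm[10:5]
        | ((word >> 8) & 0xF) << 1     # imm[4:1]; imm[0] is always 0
    )
    # Sign-extend from 13 bits.
    return imm - (1 << 13) if imm & (1 << 12) else imm


# The word produced by this assembler and the word produced by RARS.
this_assembler = int("01111100000000000000101011100011", 2)
rars = int("11111110000000000000110011100011", 2)

print(b_type_offset(this_assembler))  # 4052
print(b_type_offset(rars))            # -8
```

The same helper applied to the preceding `beq s1 x0 out` word gives 8, which matches the distance from that instruction to the `out` label.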

Using LUI with hexadecimal throws KeyError: '0x4000'

I'm guessing the assembler doesn't support hexadecimal immediates, because when I try addi x1, x1, 0x0
it says ValueError: invalid literal for int() with base 10: '0x0'.
Please add support for hexadecimal values.
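
The error message suggests the immediate is passed to `int()` with the default base 10. As a sketch of one possible fix (the function name `parse_immediate` is hypothetical and not taken from the package), Python's `int()` accepts base 0, which infers the base from a `0x`, `0o`, or `0b` prefix:

```python
def parse_immediate(token: str) -> int:
    """Parse a decimal, hex (0x), octal (0o), or binary (0b) immediate.

    Hypothetical helper; riscv-assembler's own parsing may differ.
    """
    # Base 0 tells int() to infer the base from the literal's prefix.
    return int(token.strip(), 0)


print(parse_immediate("0x4000"))  # 16384
print(parse_immediate("0x0"))     # 0
print(parse_immediate("-32"))     # -32
```

One caveat of base 0: decimal literals with leading zeros such as "010" are rejected, following Python's own integer-literal rules.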

How do you deal with the .data segment and the lui instruction?

In your code, lines starting with "." are ignored, so how should the data segment in an assembly file be handled? Additionally, the LUI instruction is usually accompanied by the "%hi" operator, but I didn't see any handling of that operator in the code.
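
For context on what handling `%hi` would involve: in the usual RISC-V assembler convention, `%hi(addr)` and `%lo(addr)` split a 32-bit address into a 20-bit upper part for `lui` and a signed 12-bit lower part for a following `addi` or load/store. The sketch below only illustrates that arithmetic; `split_hi_lo` is a hypothetical name and nothing here reflects how riscv-assembler works internally.

```python
from typing import Tuple


def split_hi_lo(address: int) -> Tuple[int, int]:
    """Split a 32-bit address into (%hi, %lo) parts.

    Hypothetical helper for illustration only.
    """
    lo = address & 0xFFF
    if lo >= 0x800:
        # The low part is sign-extended by addi/loads, so it is negative here.
        lo -= 0x1000
    # Compensate for a negative lo by rounding the upper part up.
    hi = ((address - lo) >> 12) & 0xFFFFF
    return hi, lo


hi, lo = split_hi_lo(0x12345FFF)
print(hex(hi), lo)  # 0x12346 -1
```

The invariant is that `(hi << 12) + lo` reproduces the original address modulo 2^32.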

lb,lh not working

Hi,
there is an issue when parsing lb and lh instructions:

   lw s2, 8(sp) #works
   lb s2, 8(sp) #throws error
   lh s2, 8(sp) #throws error

the issue is in line 454 of convert.py:

elif clean[0] == "lw":  #<--lb and lh should be added
	res.append(self.I_type(clean[0], self.__reg_map(clean[3]), clean[2], self.__reg_map(clean[1])))

it works when changing this line:

elif clean[0] in ("lw", "lb", "lh"):
	res.append(self.I_type(clean[0], self.__reg_map(clean[3]), clean[2], self.__reg_map(clean[1])))

BR,
David

Incorrect conversion of SW with offset

I found this bug recently:

sw x1, 4(x0) converted into 0x00102423, but should have been 0x00102223

I validated this with two online assemblers and by working through the encoding by hand. The immediate appears to be shifted left by one bit, which is consistent with my surface-level testing.
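
The hand calculation can be reproduced with a small S-type encoder. This is a sketch written from the S-type field layout in the spec, not the package's code; `encode_s_type` is a made-up name.

```python
def encode_s_type(funct3: int, rs1: int, rs2: int, imm: int,
                  opcode: int = 0b0100011) -> int:
    """Encode an S-type (store) instruction as a 32-bit word.

    Hypothetical helper for illustration; not part of riscv-assembler.
    """
    imm &= 0xFFF  # keep the 12-bit two's-complement immediate
    return (
        (imm >> 5) << 25        # imm[11:5]
        | rs2 << 20             # source register holding the value
        | rs1 << 15             # base address register
        | funct3 << 12          # 0b010 selects sw
        | (imm & 0x1F) << 7     # imm[4:0]
        | opcode
    )


# sw x1, 4(x0): rs2 = x1, rs1 = x0, offset 4
print(f"0x{encode_s_type(0b010, rs1=0, rs2=1, imm=4):08x}")  # 0x00102223
# The reported wrong word is what an offset of 8 would encode to.
print(f"0x{encode_s_type(0b010, rs1=0, rs2=1, imm=8):08x}")  # 0x00102423
```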

Relative jump addresses calculated incorrectly

Hi,
in function calcJump(self,lineNumber) the relative addresses of labels are calculated incorrectly. More specifically, labels before the label of interest are counted as well, leading to jumps further than anticipated.

....
addi t0, x0, 1
blt a1, t0, exit7 <---- this address is calculated correctly
j loop_start <-----this one is one too high due to
exit7: <-----this label
li a1, 7
j exit2

loop_start:
addi t0, x0,0 #i =0
addi t1, x0, 0, #max val <---- jumps to this instruction
addi t2, x0, 0 #iteration comp
....

BR,
David
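
One common way to avoid counting label lines is a two-pass scheme: first record each label's byte address while collecting only real instructions, then compute offsets from those addresses. The sketch below is not the package's `calcJump`; the function names are hypothetical, and it assumes every non-label line assembles to exactly one 4-byte instruction (so multi-instruction pseudo-instructions are out of scope).

```python
def collect_labels(lines):
    """First pass: map each label to its byte address, skipping label lines.

    Hypothetical helper; assumes one 4-byte instruction per non-label line.
    """
    labels = {}
    instructions = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith(":"):
            # A label names the address of the NEXT instruction.
            labels[line[:-1]] = 4 * len(instructions)
            continue
        instructions.append(line)
    return labels, instructions


def relative_offset(labels, target, index):
    """Second pass: byte offset from instruction number `index` to `target`."""
    return labels[target] - 4 * index


source = [
    "addi t0, x0, 1",
    "blt a1, t0, exit7",
    "j loop_start",
    "exit7:",
    "li a1, 7",
    "j exit2",
    "",
    "loop_start:",
    "addi t0, x0,0 #i =0",
]
labels, instructions = collect_labels(source)
print(relative_offset(labels, "exit7", 1))       # 8
print(relative_offset(labels, "loop_start", 2))  # 12
```

With this scheme `j loop_start` lands on `addi t0, x0,0`, three instructions ahead, rather than one instruction too far.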

Incorrect conversion of BLT instruction

While converting the following sample assembly code

addi x2, x0, 1

loop:
   sub x1, x1, x2
   sw  x1, 4(x0) 
   blt x0, x1, loop

I got

0x00100113
0x402080b3
0x00102223
0x7c104ae3

the expected result was

00100113
402080b3
00102223
fe104ce3

(notice the conversion in the 4th line)

Aravind
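
The expected word can be checked against the B-type field layout in the spec. The following is a minimal sketch with a made-up helper name, `encode_b_type`, and is not taken from riscv-assembler. The branch sits at byte address 12 and `loop` at address 4, so the offset is -8.

```python
def encode_b_type(funct3: int, rs1: int, rs2: int, offset: int,
                  opcode: int = 0b1100011) -> int:
    """Encode a B-type (branch) instruction as a 32-bit word.

    Hypothetical helper for illustration; `offset` is in bytes.
    """
    imm = offset & 0x1FFF  # 13-bit two's-complement immediate
    return (
        ((imm >> 12) & 0x1) << 31    # imm[12]
        | ((imm >> 5) & 0x3F) << 25  # imm[10:5]
        | rs2 << 20
        | rs1 << 15
        | funct3 << 12               # 0b100 selects blt
        | ((imm >> 1) & 0xF) << 8    # imm[4:1]
        | ((imm >> 11) & 0x1) << 7   # imm[11]
        | opcode
    )


# blt x0, x1, loop with loop located 8 bytes before the branch
print(f"{encode_b_type(0b100, rs1=0, rs2=1, offset=-8):08x}")  # fe104ce3
```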

FileExistsError, first time assembly testing was ok but not on 2nd time onward

Hi, here is the log:

C:\Users\user0\riscv-assembler\tests\assembly>python
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 21:26:53) [MSC v.1916 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from riscv_assembler.convert import AssemblyConverter
>>> cnv = AssemblyConverter()
>>> cnv.convert("test0.s")
-----Writing to binary file-----
Output file: test0.bin
Number of instructions: 1
>>> #outputs to binary file simple.bin
...
>>> from riscv_assembler.convert import AssemblyConverter
>>> cnv = AssemblyConverter()
>>> cnv.convert("test0.s")
-----Writing to binary file-----
Output file: test0.bin
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\user0\AppData\Local\Programs\Python\Python37-32\lib\site-packages\riscv_assembler\convert.py", line 589, in convert
    return self.__post()
  File "C:\Users\user0\AppData\Local\Programs\Python\Python37-32\lib\site-packages\riscv_assembler\convert.py", line 533, in __post
    os.mkdir(f"{fname[:-2]}/bin")
FileExistsError: [WinError 183] File/Folder existed, could not create the file or folder: 'test0/bin'
>>> #outputs to binary file simple.bin
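
The traceback points at an unconditional `os.mkdir` call, which raises `FileExistsError` once the directory exists from the first run. A minimal sketch of an idempotent alternative, using the standard library's `os.makedirs(..., exist_ok=True)`; the function name `ensure_output_dir` and the path layout are assumptions for illustration, not the package's actual code:

```python
import os
import tempfile


def ensure_output_dir(source_path: str) -> str:
    """Create '<source stem>/bin' if needed and return its path.

    Hypothetical helper; mirrors the 'test0/bin' layout seen in the traceback.
    """
    stem = os.path.splitext(source_path)[0]
    out_dir = os.path.join(stem, "bin")
    # exist_ok=True makes repeated calls safe; parent dirs are created too.
    os.makedirs(out_dir, exist_ok=True)
    return out_dir


# Calling it twice mimics converting the same file twice.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "test0.s")
    ensure_output_dir(src)
    ensure_output_dir(src)  # no FileExistsError on the second call
```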

IndexError while running AssemblyConverter().convert(file)

I was just using the example from the documentation

from riscv_assembler.convert import AssemblyConverter
cnv = AssemblyConverter(output_type="bp")
cnv.convert("test.s")

but with my own assembly code. This IndexError was then raised:

IndexError                                Traceback (most recent call last)
<ipython-input-2-5a83df4fad26> in <module>
      1 from riscv_assembler.convert import AssemblyConverter
      2 cnv = AssemblyConverter(output_type="bp")
----> 3 cnv.convert("test.s")

~/<dir>/python3.9/site-packages/riscv_assembler/convert.py in convert(self, filename)
    578                 self.filename = filename
    579                 self.code = self.__read_in_advance()
--> 580                 self.instructions = self.__get_instructions()
    581 
    582                 if self.hexMode:

~/<dir>/python3.9/site-packages/riscv_assembler/convert.py in __get_instructions(self)
    413                         line = self.code[i]
    414 
--> 415                         response = self.__interpret(line,i)
    416                         if -1 not in response:
    417                                 instructions.extend(response)

~/<dir>/python3.9/site-packages/riscv_assembler/convert.py in __interpret(self, line, i)
    458                         #print(res)
    459                 elif clean[0] in self.S_instr:
--> 460                         res.append(self.S_type(clean[0], self.__reg_map(clean[3]), self.__reg_map(clean[1]), clean[2]))
    461                         #print(res)
    462                 elif clean[0] in self.SB_instr:

IndexError: list index out of range
