El-Din, M. M., H. A. H. Fahmy, Y. Ismail, N. Gamal, and H. Mostafa, "Leakage Power Evaluation of FinFET-Based FPGA Cluster Under Threshold Voltage Variation", The 11th International Design and Test Symposium, Hammamet, Tunisia, 2016.
Elashkar, N., M. Aboudina, H. A. H. Fahmy, G. H. Ibrahim, and A. H. Khalil, "Memristor based BPSK and QPSK Demodulators with Nonlinear Dopant Drift Model", Microelectronics Journal, vol. 56, pp. 17–24, 2016.

In this paper, the dependence of the instantaneous memristance value and its I–V characteristics on the phase of a periodic signal is studied. An expression for the instantaneous memristance as a function of the periodic input phase is derived. This derivation is based on the memristor linear dopant drift model and is provided for sinusoidal input waveforms. To confirm this trend, simulations using linear and nonlinear dopant drift memristor models are performed in the Cadence simulation environment. Based on these simulations, a set of digital communication demodulators is proposed and investigated, exploiting the change of the average memristance with the initial phase of the applied signal. The experimentally based nonlinear dopant drift model is used in designing the proposed demodulators for Binary Phase Shift Keying (BPSK) and Quadrature Phase Shift Keying (QPSK) modulation schemes. Since all proposed demodulators are asynchronous, they do not need any carrier recovery circuits. Moreover, transient simulations show proper matching to the expected performance.
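
The phase dependence described above can be illustrated with a minimal sketch, assuming the widely used linear dopant drift model in which the normalized state x follows dx/dt = k·i(t) and the memristance is M = R_on·x + R_off·(1 − x). The function name, the lumped constant k, and all numeric values below are illustrative assumptions and are not taken from the paper.

```python
import math

def average_memristance(phase, r_on=100.0, r_off=16e3, k=1e4,
                        x0=0.5, v_amp=1.0, freq=2.0, steps=20000):
    """Average memristance over one period of v(t) = v_amp*sin(2*pi*freq*t + phase).

    Linear dopant drift (assumed): dx/dt = k * i(t), with k standing in for
    mu_v * R_on / D**2. All parameter values are hypothetical.
    """
    x = x0
    dt = 1.0 / (freq * steps)          # one input period split into `steps` slices
    total = 0.0
    for n in range(steps):
        m = r_on * x + r_off * (1.0 - x)   # instantaneous memristance
        total += m
        v = v_amp * math.sin(2.0 * math.pi * freq * n * dt + phase)
        x += k * (v / m) * dt              # forward-Euler state update
        x = min(1.0, max(0.0, x))          # keep the state inside the device
    return total / steps

if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi, 3 * math.pi / 2):
        print(f"phase = {phase:.2f} rad, average M = {average_memristance(phase):.0f} ohm")
```

With these assumed values, an input starting at phase 0 first drives the memristance below its initial value, while an input starting at phase π first drives it above, so the one-period averages differ. A phase demodulator could threshold such an average, although the circuits in the paper rely on the nonlinear model rather than this simplified one.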

Radwan, A. G., W. S. Sayed, and H. A. H. Fahmy, "Double-Sided Bifurcations in Tent maps: Analysis and Applications", The 3rd International Conference on Advances in Computational Tools for Engineering Applications (ACTEA), Lebanon, 2016.
Wahba, A. A., and H. A. H. Fahmy, "Area Efficient and Fast Combined Binary/Decimal Floating Point Fused Multiply Add Unit", IEEE Transactions on Computers, vol. 66, no. 2, pp. 226–239, 2017.

In this work, we present a new 64-bit floating point Fused Multiply Add (FMA) unit that can perform both binary and decimal addition, multiplication, and fused-multiply-add operations. The presented FMA has 6% less delay than the fastest stand-alone decimal unit and 23% less area than both binary and decimal units together. These results were achieved by using: 1) column-by-column reduction to reduce the partial products in the multiplier tree, 2) a new leading zeros detector that produces its output in base-3 to simplify the normalization shifting in the binary datapath, 3) a redundant adder to perform the final addition, 4) a new rounding-while-redundant technique to hide the rounding delay and remove it from the critical path, and 5) a new simple conversion technique from redundant to binary/decimal.

Sayed, W. S., A. G. Radwan, A. A. Rezk, and H. A. H. Fahmy, "Finite Precision Logistic Map Between Computational Efficiency and Accuracy with Encryption Applications", Complexity, 2017.

Chaotic systems appear in many applications such as pseudo-random number generation, text encryption, and secure image transfer. Numerical solutions of these systems using digital software or hardware inevitably deviate from the expected analytical solutions. Chaotic orbits produced using finite precision systems do not exhibit the infinite period expected under the assumptions of infinite simulation time and precision. In this paper, digital implementation of the generalized logistic map with signed parameter is considered. We present a fixed-point hardware realization of a Pseudo-Random Number Generator using the logistic map that experiences a tradeoff between computational efficiency and accuracy. Several factors, such as the precision used, the order of execution of the operations, and the parameter and initial point values, affect the properties of the finite precision map. For positive and negative parameter cases, the studied properties include bifurcation points, output range, maximum Lyapunov exponent, and period length. The performance of the finite precision logistic map is compared in the two cases. A basic stream cipher system is realized to evaluate the system performance for encryption applications for different bus sizes with respect to the encryption key size, hardware requirements, maximum clock frequency, NIST tests, and correlation, histogram, entropy, and Mean Absolute Error analyses of encrypted images.
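
The finite period and the sensitivity to operation order mentioned above can be seen in a small sketch, assuming an unsigned fixed-point representation with `frac_bits` fractional bits and truncation after each multiplication. The function names, the truncation choice, and the parameter values are assumptions made for illustration; the paper's hardware realization and its negative-parameter case are not reproduced here.

```python
def logistic_step(x, lam, frac_bits, multiply_lambda_first=False):
    """One iteration of x -> lam * x * (1 - x) in truncating fixed point.

    x and lam are integers that represent value / 2**frac_bits.
    The flag selects between two orders of the two multiplications.
    """
    one = 1 << frac_bits
    if multiply_lambda_first:
        t = (lam * x) >> frac_bits          # (lam * x), truncated
        return (t * (one - x)) >> frac_bits
    t = (x * (one - x)) >> frac_bits        # x * (1 - x), truncated
    return (lam * t) >> frac_bits

def transient_and_period(x0, lam, frac_bits, **kwargs):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    first_seen = {}
    x, n = x0, 0
    while x not in first_seen:
        first_seen[x] = n
        x = logistic_step(x, lam, frac_bits, **kwargs)
        n += 1
    return first_seen[x], n - first_seen[x]

if __name__ == "__main__":
    for bits in (8, 12, 16):
        one = 1 << bits
        lam = int(3.99 * one)   # assumed parameter value
        x0 = int(0.3 * one)     # assumed initial point
        for flag in (False, True):
            print(bits, flag, transient_and_period(x0, lam, bits,
                                                   multiply_lambda_first=flag))
```

Because the state space holds at most 2**frac_bits + 1 values, every orbit must eventually repeat, so the period is bounded by the precision. Running the loop for both orders shows how the same nominal map can yield different transient and cycle lengths.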

Sayed, W. S., H. A. H. Fahmy, A. A. Rezk, and A. G. Radwan, "Generalized Smooth Transition Map Between Tent and Logistic Maps", International Journal of Bifurcation and Chaos, vol. 27, no. 01, pp. 1730004, 2017.

There is a continuous demand for novel chaotic generators to be employed in various modeling and pseudo-random number generation applications. This paper proposes a new chaotic map which is a general form for one-dimensional discrete-time maps employing the power function, with the tent and logistic maps as special cases. The proposed map uses extra parameters to provide responses that fit multiple applications for which conventional maps were not enough. The proposed generalization also covers maps whose iterative relations are not based on polynomials, i.e., with fractional powers. We introduce a framework for analyzing the proposed map mathematically and predicting its behavior for various combinations of its parameters. In addition, we present and explain the transition map, which results in intermediate responses as the parameters vary from the values corresponding to the tent map to those corresponding to the logistic map. We study the properties of the proposed map, including the graph of the map equation, the general bifurcation diagram and its key points, output sequences, and the maximum Lyapunov exponent. We present further explorations such as the effects of scaling, the system response with respect to the new parameters, and operating ranges other than the transition region. Finally, a stream cipher system based on the generalized transition map validates its utility for image encryption applications. The system allows the construction of more efficient encryption keys, which enhances its sensitivity and other cryptographic properties.

Sayed, W. S., and H. A. H. Fahmy, "What are the Correct Results for the Special Values of the Operands of the Power Operation?", ACM Transactions on Mathematical Software, vol. 42, no. 2, New York, NY, USA, ACM, pp. 14:1–14:17, May 2016.

Language standards such as C99 and C11, as well as the IEEE Standard for Floating-Point Arithmetic 754 (IEEE Std 754-2008), specify the expected behavior of binary and decimal floating-point arithmetic in computer programming environments and the handling of special values and exception conditions. Many researchers focus on verifying the compliance of implementations for binary and decimal floating-point operations with these standards. In this article, we are concerned with the special values of the operands of the power function $Z = X^Y$. We study how the standards define the correct results for this operation, propose a mathematically justified definition for the correct results of the power function when these special values occur as its operands, test how different software implementations of the power function deal with these special values, and classify the behavior of different programming languages according to how closely they conform to the standards and to our proposed mathematical definition. We present inconsistencies between the implementations and the standards, and we discuss incompatibilities between different versions of the same software.
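
As a small illustration of the kind of test described above, the sketch below probes how one implementation, Python's `math.pow`, treats a few special operands. The helper name `power_result` and the selection of cases are assumptions for illustration; this is not the test suite used in the article, and other languages or library versions may behave differently.

```python
import math

def power_result(x, y):
    """Return math.pow(x, y), or the exception name if the call raises."""
    try:
        return math.pow(x, y)
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__

nan, inf = math.nan, math.inf

# A few special-operand cases (illustrative selection only).
cases = [
    (0.0, 0.0),    # zero raised to zero
    (nan, 0.0),    # NaN base, zero exponent
    (1.0, nan),    # base one, NaN exponent
    (inf, 0.0),    # infinite base, zero exponent
    (2.0, inf),    # base above one, infinite exponent
    (0.5, inf),    # base below one, infinite exponent
    (0.0, -1.0),   # zero raised to a negative power
    (-1.0, 0.5),   # negative base, non-integer exponent
]

if __name__ == "__main__":
    for x, y in cases:
        print(f"pow({x}, {y}) -> {power_result(x, y)}")
```

In CPython, `math.pow` follows the C99 conventions for the NaN cases (a zero exponent or a base of one yields 1.0), while the last two cases raise `ValueError` instead of returning an infinity or NaN, which is one example of an implementation-specific choice.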

Hassan, A. M., H. A. H. Fahmy, and N. H. Rafat, "Enhanced Model of Conductive Filament-Based Memristor via including Trapezoidal Electron Tunneling Barrier Effect", IEEE Transactions on Nanotechnology (TNANO), vol. 15, no. 3, pp. 484–491, 2016.

Memristors exhibit very promising features such as nonvolatility and small area. Several types of memristors have been developed in the last decade using different materials, along with physical models explaining their behaviors. In this paper, we modify a previously published model to account for a trapezoidal electron tunneling barrier rather than a zero field or constant potential barrier. The model is verified against experimental data, showing better agreement. We then study the effect of different memristor parameters on the I–V characteristics and how to shape the characteristics to fit the applications. Finally, we provide a SPICE model which takes into account the tunneling capacitance, and we clarify that any fabricated memristor inherently has a memcapacitor in parallel. The dominant element may be the memristor or the memcapacitor depending on the frequency of operation.

Gamal, N., H. A. H. Fahmy, Y. Ismail, and H. Mostafa, "Design Guidelines for Embedded NoCs on FPGAs", The 17th IEEE International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA, USA, 2016.

Including Networks-on-Chip (NoCs) within FPGAs has become necessary to overcome the problems of the point-to-point interconnect scheme. This will enable interfacing with high speed IOs and partial dynamic reconfiguration (PDR), reduce compile time, and improve system performance. We compared FPGA-specific NoC components in soft and hard implementations and analyzed the efficiency gap between the two technologies to obtain design constraints in this space. The input module, which includes memory buffers implemented using block RAMs (BRAMs), shows a smaller gap: 1.8x in area, 2.9x in delay, and 5.3x in power. The switch has the largest gap: 90x in area, 7x in delay, and 53x in power. If the router is implemented entirely in hard logic, this saves 9x in area, 3.7x in delay, and 12x in power. By comparing our results with the same flow on an ASIC-specific router, we show that using an FPGA-specific NoC design improves utility by 3x in area with a slight increase in delay.

Zidan, M. A., H. Omran, R. Naous, A. Sultan, H. A. H. Fahmy, W. D. Lu, and K. N. Salama, "Single-Readout High-Density Memristor Crossbar", Scientific Reports, vol. 6, 2016.

High-density memristor-crossbar architecture is a very promising technology for future computing systems. The simplicity of the gateless-crossbar structure is both its principal advantage and the source of undesired sneak-paths of current. This parasitic current could consume an enormous amount of energy and ruin the readout process. We introduce new adaptive-threshold readout techniques that utilize the locality and hierarchy properties of the computer-memory system to address the sneak-paths problem. The proposed methods require a single memory access per pixel for an array readout. In addition, the memristive crossbar consumes an order of magnitude less power than with state-of-the-art readout techniques.